Abstract

The threshold degree of a Boolean function $f\colon \{0,1\}^n \to \{-1,+1\}$
is the least degree of a real polynomial $p$ such that $f(x) = \operatorname{sgn} p(x)$. We
construct two halfspaces on $\{0,1\}^n$ whose intersection has threshold degree
$\Theta(\sqrt{n})$, an exponential improvement on previous lower bounds. This solves
an open problem due to Klivans (2002) and rules out the use of perceptron-based
techniques for PAC learning the intersection of two halfspaces, a central
unresolved challenge in computational learning. We also prove that the
intersection of two majority functions has threshold degree $\Omega(\log n)$, which
is tight and settles a conjecture of O'Donnell and Servedio (2003).

Our proof consists of two parts. First, we show that for any nonconstant
Boolean functions $f$ and $g$, the intersection $f(x) \wedge g(y)$ has threshold
degree $O(d)$ if and only if $\|f - F\|_\infty + \|g - G\|_\infty < 1$ for some rational
functions $F$, $G$ of degree $O(d)$. Second, we settle the least degree required
for approximating a halfspace and a majority function to any given accuracy
by rational functions.

Our technique further allows us to make progress on Aaronson's challenge
(2008) and contribute strong direct product theorems for polynomial
representations of composed Boolean functions of the form $F(f_1, \dots, f_n)$. In
particular, we give an improved lower bound on the approximate degree of
the AND-OR tree.
1 Introduction

Representations of Boolean functions by real polynomials play an important role
in theoretical computer science, with applications ranging from complexity theory
to quantum computing and learning theory. The surveys in [7, 40, 13, 43] offer
a glimpse into the diversity of these results and techniques. We study one such
representation scheme known as sign-representation. Specifically, fix a Boolean
function $f\colon X \to \{-1,+1\}$ for some finite set $X \subset \mathbb{R}^n$, such as the hypercube
$X = \{-1,+1\}^n$. The threshold degree of $f$, denoted $\deg_\pm(f)$, is the least degree
of a polynomial $p(x_1, \dots, x_n)$ such that

\[ f(x) = \operatorname{sgn} p(x) \]

for each $x \in X$. In other words, the threshold degree of $f$ is the least degree of a
real polynomial that represents $f$ in sign.
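
To make the definition concrete, the following minimal sketch checks by brute force whether a candidate polynomial sign-represents a function on a small hypercube. The helper name `sign_represents` and the two example functions are illustrative assumptions, not part of the paper.

```python
from itertools import product

def sign_represents(p, f, domain):
    """Return True if f(x) * p(x) > 0 at every point x of the domain."""
    return all(f(x) * p(x) > 0 for x in domain)

# Illustrative functions on the hypercube {-1,+1}^3.
cube = list(product([-1, 1], repeat=3))
maj3 = lambda x: 1 if sum(x) > 0 else -1       # majority of three bits
parity3 = lambda x: x[0] * x[1] * x[2]         # parity of three bits
linear = lambda x: x[0] + x[1] + x[2]          # a degree-1 candidate polynomial

maj_has_degree_one = sign_represents(linear, maj3, cube)
linear_fails_on_parity = not sign_represents(linear, parity3, cube)
parity_represents_itself = sign_represents(parity3, parity3, cube)
```

The check only certifies an upper bound on the threshold degree (here, that majority on three bits has threshold degree at most 1); lower bounds require a dual argument of the kind developed later in the paper.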

The formal study of this complexity measure and of sign-representations in
general began in 1969 with the seminal work of Minsky and Papert [30], who
examined the threshold degree of several common functions. Since then,
sign-representations have found a variety of applications in theoretical computer
science. Paturi and Saks [35] and later Siu et al. [47] used Boolean functions with
high threshold degree to obtain size-depth trade-offs for threshold circuits. The
well-known result, due to Beigel et al. [9], that PP is closed under intersection is
also naturally interpreted in terms of threshold degree. In another development,
Aspnes et al. [6] used the notion of threshold degree and its relaxations to obtain
oracle separations for PP and to give an insightful new proof of classical lower
bounds for $\mathrm{AC}^0$. Krause and Pudlák [26, 27] used random restrictions to show that
the threshold degree gives lower bounds on the weight and density of perceptrons
and their generalizations, which are well-studied computational models.

Learning theory is another area in which the threshold degree of Boolean 
functions is of considerable interest. Specifically, functions with low threshold 
degree can be efficiently PAC learned under arbitrary distributions via linear pro- 
gramming. The current fastest algorithm for PAC learning polynomial-size DNF 
formulas, due to Klivans and Servedio [21], is an illustrative example: it is based
precisely on an upper bound on the threshold degree of this concept class. 
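
As a rough illustration of why low threshold degree helps learning, the sketch below expands each input into all multilinear monomials of degree at most d and then runs the classical perceptron update on that expansion. This swaps in the perceptron rule for the linear programming mentioned in the text, and the function names and the XOR example are assumptions made for illustration only.

```python
from itertools import combinations, product

def monomials(x, d):
    """All multilinear monomials of degree <= d evaluated at x (constant term first)."""
    feats = []
    for size in range(d + 1):
        for idx in combinations(range(len(x)), size):
            value = 1
            for i in idx:
                value *= x[i]
            feats.append(value)
    return feats

def train_perceptron(samples, d, max_epochs=1000):
    """Perceptron over degree-<=d monomials; returns weights, or None if no convergence."""
    w = [0] * len(monomials(samples[0][0], d))
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in samples:
            phi = monomials(x, d)
            if label * sum(wi * pi for wi, pi in zip(w, phi)) <= 0:
                w = [wi + label * pi for wi, pi in zip(w, phi)]
                mistakes += 1
        if mistakes == 0:
            return w          # w is a sign-representing polynomial on the samples
    return None

# XOR of two bits on {-1,+1}^2 has threshold degree 2.
xor_samples = [(x, x[0] * x[1]) for x in product([-1, 1], repeat=2)]
weights_deg2 = train_perceptron(xor_samples, d=2)
weights_deg1 = train_perceptron(xor_samples, d=1, max_epochs=50)
```

The number of features grows roughly as $n^d$, which is why an upper bound on the threshold degree translates into a running-time bound, and why a high threshold degree rules out this family of techniques.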

The threshold degree has recently become a versatile tool in communication
complexity. The starting point in this line of work is the Degree/Discrepancy
Theorem [41, 42], which states that any Boolean function with high threshold degree
induces a communication problem with low discrepancy and thus high
communication complexity in almost all models. This result was used in [41] to show the
optimality of Allender's simulation of $\mathrm{AC}^0$ by majority circuits, thus solving an
open problem of Krause and Pudlák [26]. Known lower bounds on the threshold
degree have played an important role in recent progress [44, 38] on
unbounded-error communication complexity, which is considerably more powerful than the
models above.

In summary, the threshold degree has a variety of applications in circuit
complexity, learning theory, and communication complexity. Nevertheless, analyzing
the threshold degree has remained a difficult task, and Minsky and Papert's
symmetrization technique from 1969 has been essentially the only method available.
Unfortunately, symmetrization only applies to symmetric Boolean functions and
certain derivations thereof. In a recent tutorial presented at the FOCS'08
conference, Aaronson [2] re-posed the challenge of developing new analytic
techniques for multivariate real polynomials that represent Boolean functions. We
make significant progress on this challenge in the context of sign-representation,
contributing a number of strong direct product theorems for the threshold degree.
As an application, we construct two halfspaces on $\{0,1\}^n$ whose intersection has
threshold degree $\Omega(\sqrt{n})$, which solves an open problem due to Klivans [19] and
rules out the use of perceptron-based techniques for PAC learning the intersection
of even two halfspaces (a central unresolved challenge in computational learning
theory). We give a detailed description of our results in Sections 1.1-1.3, followed
by a discussion of our techniques in Section 1.4.



1.1 Results for general compositions 

Our first result is a general direct product theorem for the threshold degree of 
composed functions. 

Theorem 1.1 (Threshold degree). Consider functions $f\colon X \to \{-1,+1\}$ and
$F\colon \{-1,+1\}^k \to \{-1,+1\}$, where $X$ is a finite set. Then

\[ \deg_\pm(F(f, \dots, f)) \ge \deg_\pm(F) \deg_\pm(f). \]

Theorem 1.1 gives the best possible lower bound that depends on $\deg_\pm(F)$
and $\deg_\pm(f)$ alone. In particular, the bound is tight whenever $F = \mathrm{PARITY}$ or
$f = \mathrm{PARITY}$. To our knowledge, the only previous direct product theorem of
any kind for the threshold degree was the XOR lemma in [33], which states that
the XOR of $k$ copies of a given function $f\colon X \to \{-1,+1\}$ has threshold degree
$k \deg_\pm(f)$.

We are able to generalize Theorem 1.1 to the notion of $\epsilon$-approximate degree
$\deg_\epsilon(F)$, which is the least degree of a real polynomial $p$ with $\|F - p\|_\infty \le \epsilon$.
This notion plays a fundamental role in complexity theory, learning theory, and
quantum computing and was also re-posed as an analytic challenge in Aaronson's
tutorial [2]. We have:
Theorem 1.2 (Approximate degree). Fix functions $f\colon X \to \{-1,+1\}$ and
$F\colon \{-1,+1\}^k \to \{-1,+1\}$, where $X \subset \mathbb{R}^n$ is a finite set. Then for $0 < \epsilon < 1$,

\[ \deg_\epsilon(F(f, \dots, f)) \ge \deg_\epsilon(F) \deg_\pm(f). \]

Again, Theorem 1.2 gives the best lower bound that depends on $\deg_\epsilon(F)$ and
$\deg_\pm(f)$ alone. For example, the stated bound is tight for any function $F$ when
$f = \mathrm{PARITY}$. In Section 3.1, we prove various other results involving bounded-error
and small-bias approximation, as well as compositions of the form $F(f_1, \dots, f_k)$
where $f_1, \dots, f_k$ may all be distinct.

We use Theorem 1.2 to obtain an improved lower bound on the approximate
degree of the well-studied AND-OR tree, given by

\[ f(x) = \bigwedge_{i=1}^{n} \bigvee_{j=1}^{n} x_{ij}. \tag{1.1} \]

Prior to this work, the best lower bound was $\Omega(n^{0.66\ldots})$, due to Ambainis [4].
Preceding it were lower bounds of $\Omega(\sqrt{n})$ due to Nisan and Szegedy [32] and
$\Omega(\sqrt{n \log n})$ due to Shi [46]. We improve the standing lower bound from
$\Omega(n^{0.66\ldots})$ to $\Omega(n^{0.75})$, the best upper bound being $O(n)$ due to Høyer et al. [16].

Theorem 1.3 (AND-OR Tree). Define $f\colon \{-1,+1\}^{n^2} \to \{-1,+1\}$ by (1.1).
Then

\[ \deg_{1/3}(f) = \Omega(n^{0.75}). \]

Furthermore, the proof of Theorem 1.3 is simpler and more modular than the
previous lower bound [4], which was based on the collision and element distinctness
problems.



1.2 Results for specific compositions 

While Theorems 1.1 and 1.2 give the best lower bounds that depend on $\deg_\pm(F)$,
$\deg_\pm(f)$, and $\deg_\epsilon(F)$ alone, much stronger lower bounds can be derived in
some cases by exploiting additional structure of $F$ and $f$. Consider the special
but illustrative case of the conjunction of two functions. In other words, we are
given functions $f\colon X \to \{-1,+1\}$ and $g\colon Y \to \{-1,+1\}$ for some finite sets
$X, Y \subset \mathbb{R}^n$ and would like to determine the threshold degree of their conjunction,
$(f \wedge g)(x, y) = f(x) \wedge g(y)$. A simple and elegant method for sign-representing
$f \wedge g$, due to Beigel et al. [9], is to use rational approximation. Specifically, let
$p_1(x)/q_1(x)$ and $p_2(y)/q_2(y)$ be rational functions of degree $d$ that approximate $f$
and $g$, respectively, in the following sense:

\[ \max_{x \in X} \left| f(x) - \frac{p_1(x)}{q_1(x)} \right| + \max_{y \in Y} \left| g(y) - \frac{p_2(y)}{q_2(y)} \right| < 1. \tag{1.2} \]

Letting $-1$ and $+1$ correspond to "true" and "false," respectively, we obtain:

\[ f(x) \wedge g(y) = \operatorname{sgn}\{1 + f(x) + g(y)\} = \operatorname{sgn}\left\{ 1 + \frac{p_1(x)}{q_1(x)} + \frac{p_2(y)}{q_2(y)} \right\}. \tag{1.3} \]

Multiplying the last expression in braces by the positive quantity $q_1(x)^2 q_2(y)^2$
gives

\[ f(x) \wedge g(y) = \operatorname{sgn}\left\{ q_1(x)^2 q_2(y)^2 + p_1(x) q_1(x) q_2(y)^2 + p_2(y) q_1(x)^2 q_2(y) \right\}, \]

whence $\deg_\pm(f \wedge g) \le 4d$. In summary, if $f$ and $g$ can be approximated as in (1.2)
by rational functions of degree at most $d$, then the conjunction $f \wedge g$ has threshold
degree at most $4d$.
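
The construction can be checked mechanically on a small instance. The sketch below takes both functions to be the OR function on $\{0,1\}^3$ (with $-1$ as "true"), uses the degree-1 rational approximant $(1 - M\sum x_i)/(1 + M\sum x_i)$, and verifies that the degree-4 polynomial from the last display sign-represents the conjunction. The choice of functions, the constant `M = 4`, and all identifiers are assumptions made for illustration.

```python
from itertools import product

M = 4  # illustrative constant; any M > 3 makes the two approximation errors sum to < 1

def f(x):
    """OR with -1 = true: equals +1 exactly when x is the all-zero vector."""
    return 1 if sum(x) == 0 else -1

def p(x):
    return 1 - M * sum(x)    # numerator of the degree-1 rational approximant

def q(x):
    return 1 + M * sum(x)    # denominator, positive on {0,1}^n

def conjunction(x, y):
    """f(x) AND f(y) with -1 = true: true only if both arguments are true."""
    return -1 if f(x) == -1 and f(y) == -1 else 1

def sign_poly(x, y):
    """q1^2 q2^2 + p1 q1 q2^2 + p2 q1^2 q2, a polynomial of degree 4d = 4."""
    return (q(x) ** 2 * q(y) ** 2
            + p(x) * q(x) * q(y) ** 2
            + p(y) * q(x) ** 2 * q(y))

cube = list(product([0, 1], repeat=3))
max_err = max(abs(f(x) - p(x) / q(x)) for x in cube)   # error of one approximant
represents = all(conjunction(x, y) * sign_poly(x, y) > 0 for x in cube for y in cube)
```

Here each approximant has error $2/(1+M)$, so the two errors sum to less than 1 and condition (1.2) holds; the brute-force check then confirms the sign-representation on all 64 input pairs.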

It is natural to ask whether there exists a better construction. After all, given a
sign-representing polynomial $p(x, y)$ for $f(x) \wedge g(y)$, there is no reason to expect
that $p$ arises from the sum of two independent rational functions as in (1.3). Indeed,
$x$ and $y$ can be tightly coupled inside $p(x, y)$ and can interact in complicated ways.
Our next result is that, surprisingly, no such interactions can beat the simple
construction above. In other words, the sign-representation based on rational functions
always achieves the optimal degree, up to a small constant factor.

Theorem 1.4 (Conjunctions of functions). Let $f\colon X \to \{-1,+1\}$ and
$g\colon Y \to \{-1,+1\}$ be given functions, where $X, Y \subset \mathbb{R}^n$ are arbitrary finite sets.
Assume that $f$ and $g$ are not identically false. Let $d = \deg_\pm(f \wedge g)$. Then there exist
degree-$4d$ rational functions

\[ \frac{p_1(x)}{q_1(x)}, \qquad \frac{p_2(y)}{q_2(y)} \]

that satisfy (1.2).

Via repeated applications of Theorem 1.4, we are able to obtain analogous
results for conjunctions $f_1 \wedge f_2 \wedge \cdots \wedge f_k$ for any Boolean functions $f_1, f_2, \dots, f_k$
and any $k$. Our results further extend to compositions $F(f_1, \dots, f_k)$ for various $F$
other than $F = \mathrm{AND}$, such as halfspaces and read-once AND/OR/NOT formulas.
We defer a more detailed description of these extensions to Section 3.4, limiting
this overview to the following representative special case.
Theorem 1.5 (Extension to multiple functions). Let $f_1, f_2, \dots, f_k$ be nonconstant
Boolean functions on finite sets $X_1, X_2, \dots, X_k \subset \mathbb{R}^n$, respectively. Let
$F\colon \{-1,+1\}^k \to \{-1,+1\}$ be a halfspace or a read-once AND/OR/NOT formula.
Assume that $F$ depends on all of its $k$ inputs and that the composition
$F(f_1, f_2, \dots, f_k)$ has threshold degree $d$. Then there is a degree-$D$ rational
function $p_i/q_i$ on $X_i$, $i = 1, 2, \dots, k$, such that

\[ \sum_{i=1}^{k} \max_{x_i \in X_i} \left| f_i(x_i) - \frac{p_i(x_i)}{q_i(x_i)} \right| < 1, \]

where $D = 8d \log 2k$.

Theorem 1.5 is close to optimal. For example, when $F = \mathrm{AND}$, the upper bound
on $D$ is tight up to a factor of $\Theta(k \log k)$; for all $F$ in the statement of the theorem,
it is tight up to a polynomial in $k$. See Remark 3.22 for details.

Theorems 1.4 and 1.5 contribute a strong technique for proving lower bounds
on the threshold degree, via rational approximation. Prior to this paper, it was
a substantial challenge to analyze the threshold degree even for compositions of
the form $f \wedge g$. Indeed, we are only aware of the work in [30, 33], where the
threshold degree of $f \wedge g$ was studied for the special case $f = g = \mathrm{MAJORITY}$. The
main difficulty in those previous works was analyzing the unintuitive interactions
between $f$ and $g$. Our results remove this difficulty, even in the general setting of
compositions $F(f_1, f_2, \dots, f_k)$ for arbitrary $f_1, f_2, \dots, f_k$ and various combining
functions $F$. Specifically, Theorems 1.4 and 1.5 make it possible to study the base
functions $f_1, f_2, \dots, f_k$ individually, in isolation. Once their rational
approximability is understood, one immediately obtains lower bounds on the threshold degree
of $F(f_1, f_2, \dots, f_k)$.



1.3 Results for intersections of two halfspaces 



As an application of our direct product theorems in Section 1.2, we obtain the first
strong lower bounds on the threshold degree of intersections of halfspaces, i.e.,
intersections of functions of the form $f(x) = \operatorname{sgn}(\sum a_i x_i - \theta)$ for some reals
$a_1, \dots, a_n, \theta$. In light of Theorem 1.4, this task amounts to proving that rational
functions of low degree cannot approximate a given halfspace. We accomplish this
in the following theorem, where the notation $\operatorname{rdeg}_\epsilon(f)$ stands for the least degree
of a rational function $A$ with $\|f - A\|_\infty \le \epsilon$.

Theorem 1.6 (Approximation of a halfspace). Let $f\colon \{-1,+1\}^{n^2} \to \{-1,+1\}$
be given by

\[ f(x) = \operatorname{sgn}\left( 1 + \sum_{i=1}^{n} \sum_{j=1}^{n} 2^i x_{ij} \right). \tag{1.4} \]

Then for $1/3 < \epsilon < 1$,

\[ \operatorname{rdeg}_\epsilon(f) = \Theta\left( 1 + \frac{n}{\log\{1/(1-\epsilon)\}} \right). \]

Furthermore, for all $\epsilon > 0$,

\[ \operatorname{rdeg}_\epsilon(f) \le 64 n \lceil \log_2 n \rceil + 1. \]

The function (1.4) is known as the canonical halfspace. Thus, Theorem 1.6
shows that a rational function of degree $\Theta(n)$ is necessary and sufficient for
approximating the canonical halfspace within 1/3. The upper bound in this theorem
follows readily from classical work by Newman [31], and it is the lower bound
that has required of us technical novelty and effort. The best previous degree lower
bound for constant-error approximation for any halfspace was $\Omega(\log n / \log\log n)$,
obtained implicitly in [33]. We complement Theorem 1.6 with a full solution for
another common halfspace, the majority function.



Theorem 1.7 (Approximation of majority). Let $\mathrm{MAJ}_n\colon \{-1,+1\}^n \to \{-1,+1\}$
denote the majority function. Then

\[ \operatorname{rdeg}_\epsilon(\mathrm{MAJ}_n) =
\begin{cases}
\Theta\left( \log\left\{ \dfrac{2n}{\log(1/\epsilon)} \right\} \cdot \log(1/\epsilon) \right), & 2^{-n} < \epsilon < 1/3, \\[2ex]
\Theta\left( 1 + \dfrac{\log n}{\log\{1/(1-\epsilon)\}} \right), & 1/3 \le \epsilon < 1.
\end{cases} \]

Again, the upper bound in Theorem 1.7 is relatively straightforward. Indeed, an
upper bound of $O(\log\{1/\epsilon\} \log n)$ for $0 < \epsilon < 1/3$ was known and used in the
complexity literature long before our work [35, 47, 9, 20], and we only somewhat
tighten that upper bound and extend it to all $\epsilon$. Our primary contribution in
Theorem 1.7, then, is a matching lower bound on the degree, which requires
considerable effort. The closest previous line of research concerns continuous
approximation of the sign function on $[-1, -\epsilon] \cup [\epsilon, 1]$, which unfortunately gives no insight
into the discrete case. For example, the lower bound derived by Newman [31] in
the continuous setting is based on the integration of relevant rational functions with
respect to a suitable weight function, which has no meaningful discrete analogue.
We discuss our solution in greater detail at the end of the introduction.

Our first application of these lower bounds for rational approximation is to
construct an intersection of two halfspaces with high threshold degree. In what
follows, the symbol $f \wedge f$ denotes the conjunction of two independent copies of a
given function $f$.

Theorem 1.8 (Intersection of two halfspaces). Let $f\colon \{-1,+1\}^{n^2} \to \{-1,+1\}$
be given by (1.4). Then

\[ \deg_\pm(f \wedge f) = \Omega(n). \]

The lower bound in Theorem 1.8 is tight and matches the construction by
Beigel et al. [9]. Prior to our work, only an $\Omega(\log n / \log\log n)$ lower bound
was known on the threshold degree of the intersection of two halfspaces, due to
O'Donnell and Servedio [33], preceded in turn by an $\omega(1)$ lower bound of Minsky
and Papert [30]. Note that Theorem 1.8 requires the difficult part of Theorem 1.6,
namely, the lower bound for the rational approximation of a halfspace.



Theorem 1.8 solves an open problem in computational learning theory, due
to Klivans [19]. In more detail, recall that Boolean functions with low threshold
degree can be efficiently PAC learned under arbitrary distributions, by expressing
an unknown function as a perceptron with unknown weights and solving the
associated linear program [21, 20]. Now, a central challenge in the area is PAC learning
the intersection of two halfspaces under arbitrary distributions, which remains
unresolved despite much effort and solutions to some restrictions of the problem,
e.g., [28, 48, 20, 23]. Prior to this paper, it was unknown whether intersections
of two halfspaces on $\{0,1\}^n$ are amenable to learning via perceptron-based
techniques. Specifically, Klivans [19, §7] asked for a lower bound of $\Omega(\log n)$ or better
on the threshold degree of the intersection of two halfspaces. We solve this problem
with a lower bound of $\Omega(\sqrt{n})$, thereby ruling out the use of perceptron-based
techniques for learning the intersection of two halfspaces in subexponential time.
To our knowledge, Theorem 1.8 is the first unconditional, structural lower bound
for PAC learning the intersection of two halfspaces; all previous hardness results
for the problem were based on complexity-theoretic assumptions [10, 3, 25, 18].
We complement Theorem 1.8 as follows.
Theorem 1.9 (Mixed intersection). Let $f\colon \{-1,+1\}^{n^2} \to \{-1,+1\}$ be given
by (1.4). Let $g\colon \{-1,+1\}^{\lceil \sqrt{n} \rceil} \to \{-1,+1\}$ be the majority function. Then

\[ \deg_\pm(f \wedge g) = \Theta(\sqrt{n}). \]

In words, even if one of the halfspaces in Theorem 1.8 is replaced by a majority
function, the threshold degree will remain high, resulting in a challenging learning
problem. Finally, we have:

Theorem 1.10 (Intersection of two majorities). Consider the majority function
$\mathrm{MAJ}_n\colon \{-1,+1\}^n \to \{-1,+1\}$. Then

\[ \deg_\pm(\mathrm{MAJ}_n \wedge \mathrm{MAJ}_n) = \Omega(\log n). \]

Theorem 1.10 is tight, matching the construction of Beigel et al. [9]. It
settles a conjecture of O'Donnell and Servedio [33], who gave a lower bound of
$\Omega(\log n / \log\log n)$ with completely different techniques and conjectured that the
true answer was $\Omega(\log n)$. Theorems 1.8-1.10 are of course also valid for
disjunctions rather than conjunctions. Furthermore, Theorems 1.8 and 1.10 remain tight
with respect to conjunctions of any constant number of functions.

Finally, we believe that the lower bounds for rational approximation in
Theorems 1.6 and 1.7 are of independent interest. Rational functions are classical
objects with various applications in theoretical computer science [9, 35, 47, 20, 1],
and yet our ability to prove strong lower bounds for the rational approximation of
Boolean functions has seen little progress since the seminal work in 1964 by
Newman [31]. To illustrate some of the counterintuitive phenomena involved in rational
approximation, consider the familiar function $f = \mathrm{OR}_n\colon \{0,1\}^n \to \{-1,+1\}$, given by
$\mathrm{OR}_n(x) = 1 \Leftrightarrow x = 0$. A well-known result of Nisan and Szegedy [32] states that
$\deg_{1/3}(f) = \Theta(\sqrt{n})$, meaning that a polynomial of degree $\Theta(\sqrt{n})$ is required for
approximation within 1/3. At the same time, we claim that $\operatorname{rdeg}_\epsilon(f) = 1$ for all
$0 < \epsilon < 1$. Indeed, let

\[ A_M(x) = \frac{1 - M \sum x_i}{1 + M \sum x_i}. \]

Then $\|f - A_M\|_\infty \to 0$ as $M \to \infty$. This example illustrates that proving lower
bounds for rational functions can be a difficult and unintuitive task. We hope
that Theorems 1.6 and 1.7 in this paper will spur further progress on the rational
approximation of Boolean functions.
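
The claim about the OR function is easy to confirm numerically. The following sketch, with illustrative function names and an arbitrarily chosen n = 4, computes the exact sup-norm error of the degree-1 rational function $(1 - M\sum x_i)/(1 + M\sum x_i)$ for growing M.

```python
from fractions import Fraction
from itertools import product

def or_n(x):
    """OR_n(x) = 1 exactly when x = 0, and -1 otherwise."""
    return 1 if sum(x) == 0 else -1

def a_m(x, m):
    """The degree-1 rational approximant (1 - m*sum(x)) / (1 + m*sum(x))."""
    s = sum(x)
    return Fraction(1 - m * s, 1 + m * s)

def sup_error(n, m):
    """Exact maximum of |OR_n(x) - A_m(x)| over the hypercube {0,1}^n."""
    return max(abs(or_n(x) - a_m(x, m)) for x in product([0, 1], repeat=n))

errors = [sup_error(4, m) for m in (1, 10, 100, 1000)]
```

The worst case is attained at inputs of Hamming weight 1, giving error exactly $2/(1+M)$, which tends to 0 while the degree stays equal to 1.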



1.4 Our techniques 

We use one set of techniques to obtain our direct product theorems for the threshold
degree (Sections 1.1 and 1.2) and another, unrelated set of techniques to analyze
the rational approximation of halfspaces (Section 1.3). We will give a separate
overview of the technical development in each case.
Direct product theorems. In symmetrization, one takes an assumed multivariate
polynomial $p$ that sign-represents a given symmetric function and converts $p$ into
a univariate polynomial, which is amenable to direct analysis. No such approach
works for the function compositions of this paper, whose sign-representing
polynomials can have complicated structure and will not simplify in a meaningful way.
This leads us to pursue a completely different approach.

Specifically, our results are based on a thorough study of the linear programming
dual of the sign-representation problems at hand. The challenge in our work
is to bring out, through the dual representation, analytic properties that will obey
a direct product theorem. Depending on the context (Theorem 1.2 or 1.4),
the property in question can be nonnegativity, correlation, orthogonality, certain
quotient structure, or a combination of several of these. A strength of this approach
is that it works with the sign-representation problem itself (over which we have
considerable control) rather than an assumed sign-representing polynomial (whose
structure we can no longer control in a meaningful way). We are confident that this
approach will find other applications.



As a concrete illustration, we briefly describe the idea behind Theorem 1.4.
The dual object with which we work there is a certain problem of finding, in the
positive spans of two given matrices, two vectors whose corresponding entries have
comparable magnitude. By an analytic argument, we are able to prove that this
intermediate problem has the sought direct-product property, giving the missing link
between sign-representation and rational approximation. Thus, by working with
the dual, we implicitly decompose any sign-representation $p(x, y)$ of the function
$f(x) \wedge g(y)$ into individual rational approximants for $f$ and $g$, regardless of how
tightly the $x$ and $y$ parts are coupled inside $p$.



Rational approximation. Our proof of Theorem 1.6 is built around two key ideas.
The first is a new technique for placing lower bounds on the degree of a given
polynomial $p \in \mathbb{R}[x_1, x_2, \dots, x_n]$ with prescribed approximate behavior, whereby one
constructs a degree-nonincreasing linear map $M\colon \mathbb{R}[x_1, x_2, \dots, x_n] \to \mathbb{R}[x]$ and
argues that $Mp$ has high degree. This technique is crucial to proving Theorem 1.6,
which is not amenable to standard techniques such as symmetrization. As applied
in this work, the technique amounts to constructing random variables $x_1, x_2, \dots, x_n$
in Euclidean space that, on the one hand, satisfy the linear dependence $\sum 2^i x_i = z$
for a suitably fixed vector $z$ and, on the other hand, in expectation look independent
to any low-degree polynomial $p \in \mathbb{R}[x_1, x_2, \dots, x_n]$. We pass, then, from $p$ to a
univariate polynomial by observing that $\mathbf{E}[p(x_1, \dots, x_n)] = q(z)$ for some
univariate polynomial $q$ of degree no greater than the degree of $p$. This technique is a
substantial departure from previous methods and shows promise on other problems
involving approximation by polynomials or rational functions.
Second, we are able to prove that the rational approximation of the sign
function has a self-reducibility property on the discrete domain. More specifically, we
are able to give an explicit solution to the dual of the rational approximation
problem by distributing the nodes as in known positive results. What makes this
program possible in the first place is our ability to zero out the dual object on the
complementary domain, which is where the above map $M\colon \mathbb{R}[x_1, x_2, \dots, x_n] \to \mathbb{R}[x]$
plays a crucial role. This dual approach, too, departs entirely from previous
analyses. In particular, recall that Newman's lower-bound analysis is specialized to the
continuous domain and does not extend to the setting of Theorem 1.7, let alone
Theorem 1.6.

Recent progress 

A recent follow-up paper [45] proves that the intersection of two halfspaces on
$\{0,1\}^n$ has threshold degree $\Omega(n)$, improving on the lower bound of $\Omega(\sqrt{n})$
in this work. We have also learned that the inequality $\deg_\epsilon(F(f, \dots, f)) \ge
\deg_\epsilon(F) \deg_\pm(f)$ was derived independently by Lee [29] in a recent work on
read-once Boolean formulas.



2 Preliminaries 

Throughout this work, the symbol $t$ refers to a real variable, whereas $u$, $v$, $w$, $x$,
$y$, $z$ refer to vectors in $\mathbb{R}^n$ and in particular in $\{-1,+1\}^n$. We adopt the following
standard definition of the sign function:

\[ \operatorname{sgn} t =
\begin{cases}
-1, & t < 0, \\
0, & t = 0, \\
1, & t > 0.
\end{cases} \]

We will also have occasion to use the following modified sign function:

\[ \widetilde{\operatorname{sgn}}\, t =
\begin{cases}
-1, & t < 0, \\
1, & t \ge 0.
\end{cases} \]

Equations and inequalities involving vectors in $\mathbb{R}^n$, such as $x \le y$ or $x \ge 0$, are to
be interpreted component-wise, as usual.

Throughout this manuscript, we view Boolean functions as mappings $f\colon X \to
\{-1,+1\}$ for some finite set $X$, where $-1$ and $+1$ correspond to "true" and "false,"
respectively. If $\mu_1, \dots, \mu_k$ are probability distributions on finite sets $X_1, \dots, X_k$,
respectively, then $\mu_1 \times \cdots \times \mu_k$ stands for the probability distribution on
$X_1 \times \cdots \times X_k$ given by

\[ (\mu_1 \times \cdots \times \mu_k)(x_1, \dots, x_k) = \prod_{i=1}^{k} \mu_i(x_i). \]


The majority function on $n$ bits, $\mathrm{MAJ}_n\colon \{-1,+1\}^n \to \{-1,+1\}$, is given by

\[ \mathrm{MAJ}_n(x) =
\begin{cases}
1, & \sum x_i > 0, \\
-1, & \text{otherwise.}
\end{cases} \]

The symbol $P_k$ stands for the family of all univariate real polynomials of degree up
to $k$. The following combinatorial identity is well-known.

Fact 2.1. For every integer $n \ge 1$ and every polynomial $p \in P_{n-1}$,

\[ \sum_{i=0}^{n} \binom{n}{i} (-1)^i p(i) = 0. \]

This fact can be verified by repeated differentiation of the real function

\[ (t - 1)^n = \sum_{i=0}^{n} \binom{n}{i} (-1)^{n-i} t^i \]

at $t = 1$, as explained in [33].

For a real function $f$ on a finite set $X$, we write $\|f\|_\infty = \max_{x \in X} |f(x)|$. For
a subset $X \subseteq \mathbb{R}^n$, we adopt the notation $-X = \{-x : x \in X\}$. We say that a
set $X \subseteq \mathbb{R}^n$ is closed under negation if $X = -X$. Given a function $f\colon X \to \mathbb{R}$,
where $X \subseteq \mathbb{R}^n$ is closed under negation, we say that $f$ is odd (respectively, even)
if $f(-x) = -f(x)$ for all $x \in X$ (respectively, $f(-x) = f(x)$ for all $x \in X$).

Given functions $f\colon X \to \{-1,+1\}$ and $g\colon Y \to \{-1,+1\}$, recall that the
function $f \wedge g\colon X \times Y \to \{-1,+1\}$ is given by $(f \wedge g)(x, y) = f(x) \wedge g(y)$.
The function $f \vee g$ is defined analogously. Observe that in this notation, $f \wedge f$
and $f$ are completely different functions, the former having domain $X \times X$ and
the latter $X$. These conventions extend in the obvious way to any number of
functions. For example, $f_1 \wedge f_2 \wedge \cdots \wedge f_k$ is a Boolean function with domain
$X_1 \times X_2 \times \cdots \times X_k$, where $X_i$ is the domain of $f_i$. Generalizing further, we let the
symbol $F(f_1, \dots, f_k)$ denote the Boolean function on $X_1 \times X_2 \times \cdots \times X_k$ obtained
by composing a given function $F\colon \{-1,+1\}^k \to \{-1,+1\}$ with the functions
$f_1, f_2, \dots, f_k$. Finally, recall that the negated function $\overline{f}\colon X \to \{-1,+1\}$ is given
by $\overline{f}(x) = -f(x)$.
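
As a quick sanity check of Fact 2.1, the sketch below evaluates the alternating binomial sum $\sum_{i=0}^{n} \binom{n}{i} (-1)^i p(i)$ for a randomly chosen integer polynomial of degree at most $n-1$, and then for $p(t) = t^n$, where the sum no longer vanishes. It assumes that Fact 2.1 is this standard identity; the helper name and the choice n = 6 are illustrative.

```python
from math import comb
import random

def alternating_sum(coeffs, n):
    """sum_{i=0}^{n} (-1)^i C(n,i) p(i), where p(t) = sum_j coeffs[j] * t^j."""
    p = lambda t: sum(c * t ** j for j, c in enumerate(coeffs))
    return sum((-1) ** i * comb(n, i) * p(i) for i in range(n + 1))

random.seed(0)
n = 6
low_degree = [random.randint(-9, 9) for _ in range(n)]   # coefficients of degree <= n-1
vanishing = alternating_sum(low_degree, n)               # covered by the identity
degree_n = alternating_sum([0] * n + [1], n)             # p(t) = t^n is not covered
```

The second value equals $(-1)^n n!$, which shows that the degree restriction $p \in P_{n-1}$ cannot be relaxed.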
2.1 Sign-representation and approximation by polynomials

By the degree of a multivariate polynomial $p$ on $\mathbb{R}^n$, denoted $\deg p$, we shall
always mean the total degree of $p$, i.e., the greatest total degree of any monomial of
$p$. The degree of a rational function $p(x)/q(x)$ is the maximum of $\deg p$ and $\deg q$.
Given a function $f\colon X \to \{-1,+1\}$, where $X \subset \mathbb{R}^n$ is a finite set, the threshold
degree $\deg_\pm(f)$ of $f$ is defined as the least degree of a multivariate polynomial
$p$ such that $f(x) p(x) > 0$ for all $x \in X$. In words, the threshold degree of $f$
is the least degree of a polynomial that represents $f$ in sign. Equivalent terms
in the literature include "strong degree" [6], "voting polynomial degree" [26],
"polynomial threshold function degree" [34], and "sign degree" [12]. Crucial to
understanding the threshold degree is the following result, which is a well-known
corollary to Gordan's transposition theorem [15].

Theorem 2.2 (Gordan [15]). Let $X \subset \mathbb{R}^n$ be a finite set, $f\colon X \to \{-1,+1\}$
a given function. Then $\deg_\pm(f) > d$ if and only if there exists a probability
distribution $\mu$ on $X$ such that

\[ \sum_{x \in X} \mu(x) f(x) p(x) = 0 \]

for every polynomial $p$ of degree up to $d$. Equivalently, $\deg_\pm(f) > d$ if and only if
there exists a map $\psi\colon X \to \mathbb{R}$, $\psi \not\equiv 0$, such that $f(x) \psi(x) \ge 0$ on $X$ and

\[ \sum_{x \in X} \psi(x) p(x) = 0 \]

for every polynomial $p$ of degree up to $d$.

Theorem 2.2 has a short proof using linear programming duality, as explained
in m §2.2]. 
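
A small instance shows how Theorem 2.2 certifies a lower bound. For the parity function on $\{-1,+1\}^3$, the uniform distribution serves as the dual witness: parity is orthogonal to every multilinear monomial of degree at most 2, and on the hypercube these monomials span all polynomials of degree at most 2. The sketch below verifies this by enumeration; the identifiers and the choice n = 3 are illustrative assumptions.

```python
from itertools import combinations, product

n = 3
cube = list(product([-1, 1], repeat=n))

def monomial(x, idx):
    """Product of the coordinates of x indexed by idx (empty product = 1)."""
    out = 1
    for i in idx:
        out *= x[i]
    return out

def parity(x):
    return monomial(x, range(n))

# Uniform distribution mu(x) = 2^{-n}: compute sum_x mu(x) f(x) p(x)
# for every multilinear monomial p of degree at most n - 1.
correlations = {
    idx: sum(parity(x) * monomial(x, idx) for x in cube) / len(cube)
    for size in range(n)
    for idx in combinations(range(n), size)
}
full_correlation = sum(parity(x) * parity(x) for x in cube) / len(cube)
```

All correlations with monomials of degree at most 2 vanish, so by Theorem 2.2 the threshold degree of parity on three bits exceeds 2; since parity sign-represents itself with degree 3, the threshold degree is exactly 3.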

The threshold degree is closely related to another analytic notion. Let $f\colon X \to
\{-1,+1\}$ be given, for a finite subset $X \subset \mathbb{R}^n$. The $\epsilon$-approximate degree of $f$,
denoted $\deg_\epsilon(f)$, is the least degree of a polynomial $p$ such that $|f(x) - p(x)| \le \epsilon$
for all $x \in X$. The relationship between the threshold degree and approximate
degree is an obvious one:

\[ \deg_\pm(f) = \lim_{\epsilon \nearrow 1} \deg_\epsilon(f). \tag{2.1} \]

We will need the following dual characterization of the approximate degree.

Theorem 2.3. Fix $\epsilon \ge 0$. Let $f\colon X \to \{-1,+1\}$ be given, $X \subset \mathbb{R}^n$ a finite set.
Then $\deg_\epsilon(f) > d$ if and only if there exists a function $\psi\colon X \to \mathbb{R}$ such that

\[ \sum_{x \in X} f(x) \psi(x) > \epsilon, \qquad \sum_{x \in X} |\psi(x)| = 1, \]

and, for every polynomial $p$ of degree up to $d$,

\[ \sum_{x \in X} \psi(x) p(x) = 0. \]

Theorem 2.3 follows readily from linear programming duality, as explained
in [42, §3]. Theorem 2.2 can be derived from Theorem 2.3 in view of (2.1).
2.2 Approximation by rational functions 

Consider a function $f\colon X \to \{-1,+1\}$, where $X \subseteq \mathbb{R}^n$ is an arbitrary set. For
$d \ge 0$, we define

\[ R(f, d) = \inf_{p, q} \sup_{x \in X} \left| f(x) - \frac{p(x)}{q(x)} \right|, \]

where the infimum is over multivariate polynomials $p$ and $q$ of degree up to $d$ such
that $q$ does not vanish on $X$. In words, $R(f, d)$ is the least error in an approximation
of $f$ by a multivariate rational function of degree up to $d$. We will also take an
interest in the related quantity

\[ R^+(f, d) = \inf_{p, q} \sup_{x \in X} \left| f(x) - \frac{p(x)}{q(x)} \right|, \]

where the infimum is over multivariate polynomials $p$ and $q$ of degree up to $d$ such
that $q$ is positive on $X$. These two quantities are related in a straightforward way:

\[ R^+(f, 2d) \le R(f, d) \le R^+(f, d). \tag{2.2} \]

The second inequality here is trivial. The first follows from the fact that every
rational approximant $p(x)/q(x)$ of degree $d$ gives rise to a degree-$2d$ rational
approximant with the same error and a positive denominator, namely, $\{p(x) q(x)\}/q(x)^2$.
The infimum in the definitions of $R(f, d)$ and $R^+(f, d)$ cannot in general be
replaced by a minimum [39], even when $X$ is a finite subset of $\mathbb{R}$. This is in contrast to
the more familiar setting of a finite-dimensional normed linear space, where
least-error approximants are guaranteed to exist. We now recall Newman's classical
construction of a rational approximant to the sign function [31].
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Theorem 2.4 (Newman). Fix N > I. Then for every integer k ^ I, there is a 
rational function S{t) of degree k such that 



max |sgn?-S(OI ^ \-N~^'^ 

\i^\t\<^N 

and the denominator of S is positive on [—N, — 1] U [1, A/^]. 

Proof (adapted from Newman ll3ll ). Consider the univariate polynomial 



(2.3) 



Pit) = \[[t + N 



(21-1)1 (2k) 



i = \ 



By examining every interval [N'/^'^^\ A/^('+i)/(2*)], where / = 0, 1, 
sees that 



A^l/(2*:) + 1 
Pit) ^ T7T77Tr^b(-0l, 



Letting 



- A/-l/(2«:) 



Ik — I, one 



(2.4) 



5(0 = 



p{t)-p{-t) 

p{t) + p{-ty 



one has ( |2.3| ). T he p ositivity of the denominator of S on [— A'^, — 1] U [1, A^] is a 
consequence of ( 2.4 1. D 
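The construction in the proof of Theorem 2.4 is easy to test numerically. The following is a minimal sketch, assuming the normalization $S(t) = N^{-1/(2k)}\,(p(t)-p(-t))/(p(t)+p(-t))$ stated above; the function names and the sampling grid are illustrative choices, not part of the original text.

```python
import math

def newman_approximant(N: float, k: int):
    """Build the degree-k rational function S(t) from the proof of Theorem 2.4."""
    a = N ** (1.0 / (2 * k))  # a = N^{1/(2k)}
    nodes = [N ** ((2 * i - 1) / (2.0 * k)) for i in range(1, k + 1)]

    def p(t: float) -> float:
        # p(t) = prod_i (t + N^{(2i-1)/(2k)})
        return math.prod(t + r for r in nodes)

    def S(t: float) -> float:
        return (p(t) - p(-t)) / (a * (p(t) + p(-t)))

    return S

def newman_max_error(N: float, k: int, samples: int = 2000) -> float:
    """Largest sampled value of |sgn t - S(t)| over 1 <= |t| <= N (geometric grid)."""
    S = newman_approximant(N, k)
    worst = 0.0
    for j in range(samples + 1):
        t = N ** (j / samples)
        worst = max(worst, abs(1.0 - S(t)), abs(-1.0 - S(-t)))
    return worst
```

On a sampled grid, `newman_max_error(N, k)` should stay at or below `1 - N ** (-1 / k)`, up to rounding; a grid check of this kind is only a sanity test and does not replace the proof.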

A useful consequence of Newman's theorem is the following general statement 
on decreasing the error in rational approximation. 

Theorem 2.5. Let $f\colon X \to \{-1,+1\}$ be given, where $X \subseteq \mathbb{R}^n$. Let $d$ be a given integer, $\varepsilon = R(f,d)$. Then for $k = 1, 2, 3, \dots,$

\[ R(f,kd) \leq 1 - \left( \frac{1-\varepsilon}{1+\varepsilon} \right)^{1/k}. \]

Proof. We may assume that $\varepsilon < 1$, the theorem being trivial otherwise. Let $S$ be the degree-$k$ rational approximant to the sign function for $N = (1+\varepsilon)/(1-\varepsilon)$, as constructed in Theorem 2.4. Let $A_1, A_2, \dots, A_m, \dots$ be a sequence of rational functions on $X$ of degree at most $d$ such that $\sup_X |f - A_m| \to \varepsilon$ as $m \to \infty$. The theorem follows by considering the sequence of approximants $S(A_m(x)/\{1-\varepsilon\})$ as $m \to \infty$. □

2.3 Symmetrization

Let $S_n$ denote the symmetric group on $n$ elements. For $\sigma \in S_n$ and $x \in \mathbb{R}^n$, we denote $\sigma x = (x_{\sigma(1)}, \dots, x_{\sigma(n)}) \in \mathbb{R}^n$. The following is a generalized form of Minsky and Papert's symmetrization argument [30], as formulated in

Proposition 2.6 (cf. Minsky and Papert). Let $n_1, \dots, n_k$ be positive integers. Let $\phi\colon \{0,1\}^{n_1} \times \cdots \times \{0,1\}^{n_k} \to \mathbb{R}$ be a polynomial of degree $d$. Then there is a polynomial $p$ on $\mathbb{R}^k$ of degree at most $d$ such that for all $x$ in the domain of $\phi$,

\[ \mathop{\mathbf{E}}_{\sigma_1 \in S_{n_1}, \dots, \sigma_k \in S_{n_k}} [\phi(\sigma_1 x_1, \dots, \sigma_k x_k)] = p(\dots, x_{i,1} + \cdots + x_{i,n_i}, \dots). \]

We now obtain a form of the symmetrization argument for rational approxima- 
tion. 

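Proposition 2.7. Let $n_1, \dots, n_k$ be positive integers, and $\alpha, \beta$ distinct reals. Let $G\colon \{\alpha,\beta\}^{n_1} \times \cdots \times \{\alpha,\beta\}^{n_k} \to \{-1,+1\}$ be a function such that $G(x_1, \dots, x_k) \equiv G(\sigma_1 x_1, \dots, \sigma_k x_k)$ for all $\sigma_1 \in S_{n_1}, \dots, \sigma_k \in S_{n_k}$. Let $d$ be a given integer. Then for each $\varepsilon > R^+(G,d)$, there exists a rational function $p/q$ on $\mathbb{R}^k$ of degree at most $d$ such that for all $x$ in the domain of $G$, one has

\[ \left| G(x) - \frac{p(\dots, x_{i,1} + \cdots + x_{i,n_i}, \dots)}{q(\dots, x_{i,1} + \cdots + x_{i,n_i}, \dots)} \right| < \varepsilon \]

and $q(\dots, x_{i,1} + \cdots + x_{i,n_i}, \dots) > 0$.

Proof. Clearly, we may assume that $\varepsilon < 1$. Using the linear bijection $(\alpha,\beta) \leftrightarrow (0,1)$ if necessary, we may further assume that $\alpha = 0$ and $\beta = 1$. Since $\varepsilon > R^+(G,d)$, there are polynomials $P, Q$ of degree up to $d$ such that for all $x$ in the domain of $G$, one has $Q(x) > 0$ and

\[ (1-\varepsilon)Q(x) < G(x)P(x) < (1+\varepsilon)Q(x). \]

By Proposition 2.6, there exist polynomials $p, q$ on $\mathbb{R}^k$ of degree at most $d$ such that

\[ \mathop{\mathbf{E}}_{\sigma_1 \in S_{n_1}, \dots, \sigma_k \in S_{n_k}} [P(\sigma_1 x_1, \dots, \sigma_k x_k)] = p(\dots, x_{i,1} + \cdots + x_{i,n_i}, \dots) \]

and

\[ \mathop{\mathbf{E}}_{\sigma_1 \in S_{n_1}, \dots, \sigma_k \in S_{n_k}} [Q(\sigma_1 x_1, \dots, \sigma_k x_k)] = q(\dots, x_{i,1} + \cdots + x_{i,n_i}, \dots) \]

for all $x$ in the domain of $G$. Then the required properties of $p$ and $q$ follow immediately from the corresponding properties of $P$ and $Q$. □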
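The symmetrization step used in Propositions 2.6 and 2.7 can be illustrated by brute force for a single block of variables ($k = 1$). The sketch below is only an illustration: the polynomial `phi` is a hypothetical example, and exact rational arithmetic is used so that the averaged values can be compared without rounding.

```python
from fractions import Fraction
from itertools import permutations, product

def symmetrize(phi, n):
    """Return x -> E_sigma[phi(sigma x)], averaging over all permutations of n coordinates."""
    perms = list(permutations(range(n)))

    def averaged(x):
        total = sum(Fraction(phi(tuple(x[s[i]] for i in range(n)))) for s in perms)
        return total / len(perms)

    return averaged

def phi(x):
    # Hypothetical degree-2 polynomial on {0,1}^3, chosen only for illustration.
    return 3 * x[0] * x[1] - 2 * x[2] + 1

sym = symmetrize(phi, 3)

# Group the averaged values by the sum x_1 + x_2 + x_3.
values_by_sum = {}
for x in product((0, 1), repeat=3):
    values_by_sum.setdefault(sum(x), set()).add(sym(x))
```

Each entry of `values_by_sum` holds a single value, so the average depends on $x$ only through $x_1 + x_2 + x_3$. For this particular `phi` the values agree with the univariate polynomial $p(w) = w(w-1)/2 - 2w/3 + 1$, which has degree 2, matching the degree of `phi`.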
3 Direct product theorems

In the several subsections that follow, we prove our direct product theorems for polynomial representations of composed Boolean functions. General compositions are treated in Section 3.1, followed by a study of conjunctions and other specific compositions in Sections 3.2-3.5.



3.1 General compositions 

We begin our study with general compositions of the form $F(f_1, \dots, f_k)$. Our focus in this section will be on results that depend only on the threshold or approximate degrees of $F, f_1, \dots, f_k$. In later sections, we will exploit additional structure of the functions involved. The following result settles Theorems 1.1 and 1.2 from the Introduction.

Theorem 3.1. Let $f\colon X \to \{-1,+1\}$ and $F\colon \{-1,+1\}^k \to \{-1,+1\}$ be given functions, where $X \subset \mathbb{R}^n$ is a finite set. Then for $0 < \varepsilon < 1$,

\[ \deg_\varepsilon(F(f, \dots, f)) \geq \deg_\varepsilon(F) \deg_\pm(f). \tag{3.1} \]

In particular,

\[ \deg_\pm(F(f, \dots, f)) \geq \deg_\pm(F) \deg_\pm(f). \tag{3.2} \]

Proof. Recall that the threshold degree is a limiting case of the approximate degree, as given by (2.1). Hence, one obtains (3.2) by letting $\varepsilon \nearrow 1$ in (3.1). In the remainder of the proof, we focus on (3.1) alone.

Put $D = \deg_\varepsilon(F)$ and $d = \deg_\pm(f)$. By Theorem 2.3, there exists a map $\Psi\colon \{-1,+1\}^k \to \mathbb{R}$ such that

\[ \sum_{z \in \{-1,+1\}^k} |\Psi(z)| = 1, \tag{3.3} \]
\[ \sum_{z \in \{-1,+1\}^k} \Psi(z)F(z) > \varepsilon, \tag{3.4} \]

and $\sum_z \Psi(z)p(z) = 0$ for every polynomial $p$ of degree less than $D$. By Theorem 2.2, there exists a distribution $\mu$ on $X$ such that $\sum_X \mu(x)f(x)p(x) = 0$ for every polynomial $p$ of degree less than $d$.

Now, define $\zeta\colon X^k \to \mathbb{R}$ by

\[ \zeta(\dots, x_i, \dots) = 2^k\, \Psi(\dots, f(x_i), \dots) \prod_{i=1}^{k} \mu(x_i). \]

We claim that

\[ \sum_{X^k} \zeta(\dots, x_i, \dots)\, p(\dots, x_i, \dots) = 0 \tag{3.5} \]

for every polynomial $p$ of degree less than $Dd$. By linearity, it suffices to consider a polynomial $p$ of the form $p(\dots, x_i, \dots) = \prod p_i(x_i)$, where $\sum \deg p_i < Dd$. Since $\Psi$ is orthogonal on $\{-1,+1\}^k$ to all polynomials of degree less than $D$, we have the representation

\[ \Psi(z) = \sum_{|S| \geq D} \hat\Psi(S) \prod_{i \in S} z_i \]

for some reals $\hat\Psi(S)$. As a result,

\[ \sum_{X^k} \zeta(\dots, x_i, \dots)\, p(\dots, x_i, \dots) = 2^k \sum_{|S| \geq D} \hat\Psi(S) \prod_{i \in S} \underbrace{\left( \sum_{x_i \in X} \mu(x_i) f(x_i) p_i(x_i) \right)} \prod_{i \notin S} \left( \sum_{x_i \in X} \mu(x_i) p_i(x_i) \right). \tag{3.6} \]

Since $\sum \deg p_i < Dd$, the pigeonhole principle implies that $\deg p_i < d$ for more than $k - D$ indices $i \in \{1, \dots, k\}$. As a result, for each set $S$ in the outer summation of (3.6), at least one of the underbraced factors vanishes (recall that $f$ is orthogonal on $X$ with respect to $\mu$ to all polynomials of degree less than $d$). This gives (3.5).

We may assume that $f$ is not a constant function, the theorem being trivial otherwise. It follows that $\deg_\pm(f) \geq 1$ and $\sum_X \mu(x)f(x) = 0$. Now, define a product distribution $\tilde\mu$ on $X^k$ by $\tilde\mu(\dots, x_i, \dots) = \prod \mu(x_i)$. Since $\sum_X \mu(x)f(x) = 0$, it follows that the string $(\dots, f(x_i), \dots)$ is distributed uniformly on $\{-1,+1\}^k$ when $(\dots, x_i, \dots) \sim \tilde\mu$. As a result,

\[ \sum_{X^k} |\zeta(\dots, x_i, \dots)| = 2^k \mathop{\mathbf{E}}_{z \in \{-1,+1\}^k} [|\Psi(\dots, z_i, \dots)|] = 1, \tag{3.7} \]

where the last equality holds by (3.3). Similarly,

\[ \sum_{X^k} \zeta(\dots, x_i, \dots)\, F(\dots, f(x_i), \dots) = 2^k \mathop{\mathbf{E}}_{z \in \{-1,+1\}^k} [\Psi(\dots, z_i, \dots) F(\dots, z_i, \dots)] > \varepsilon, \tag{3.8} \]

where the inequality holds by (3.4). Now (3.1) follows from (3.5), (3.7), (3.8), and Theorem 2.3. □

Remark. In Theorem 3.1 and elsewhere in this paper, we consider Boolean functions on finite subsets of $\mathbb{R}^n$, which is the setting of primary interest in computational complexity. It is useful to keep in mind, however, that approximation and sign-representation problems on compact infinite sets and other well-behaved infinite sets are easily reduced to the finite case.

We now consider the so-called AND-OR tree, given by $f(x) = \bigvee_{i=1}^{n} \bigwedge_{j=1}^{n} x_{ij}$. We improve the standing lower bound on the approximate degree of $f$ from $\Omega(n^{0.66\ldots})$ to $\Omega(n^{0.75})$, the best upper bound being $O(n)$.

Theorem 1.3 (restated). Let $f\colon \{-1,+1\}^{n^2} \to \{-1,+1\}$ be given by $f(x) = \bigvee_{i=1}^{n} \bigwedge_{j=1}^{n} x_{ij}$. Then

\[ \deg_{1/3}(f) = \Omega(n^{0.75}). \]

Proof. Without loss of generality, assume that $n = 4m^2$ for some integer $m$. Define $g\colon \{-1,+1\}^{4m^3} \to \{-1,+1\}$ by

\[ g(x) = \bigvee_{i=1}^{m} \bigwedge_{j=1}^{4m^2} x_{ij}. \]

Let $G\colon \{-1,+1\}^{4m} \to \{-1,+1\}$ be given by $G(x) = x_1 \vee \cdots \vee x_{4m}$. A well-known result of Minsky and Papert [30] states that $\deg_\pm(g) = m$. Also, Nisan and Szegedy proved that $\deg_{1/3}(G) = \Theta(\sqrt{m})$. Since $f = G(g, \dots, g)$, it follows by Theorem 3.1 that $\deg_{1/3}(f) = \Omega(m\sqrt{m})$, as desired. □
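For completeness, the arithmetic behind the last step of the proof can be spelled out as follows, using only the substitution $n = 4m^2$ made at the start of the proof:

```latex
\[
  n = 4m^2
  \;\Longrightarrow\;
  m = \frac{\sqrt{n}}{2}
  \;\Longrightarrow\;
  m\sqrt{m} = m^{3/2} = \left(\frac{n}{4}\right)^{3/4}
            = \frac{n^{3/4}}{2\sqrt{2}} = \Omega(n^{0.75}).
\]
```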



We now further develop the ideas of Theorem 3.1 to obtain a more general result on the approximation of composed functions by polynomials. This generalization is based on a combinatorial property of Boolean functions known as certificate complexity. For a string $x \in \{-1,+1\}^k$ and a set $S \subseteq \{1, 2, \dots, k\}$ whose distinct elements are $i_1 < i_2 < \cdots < i_{|S|}$, we adopt the notation $x|_S = (x_{i_1}, x_{i_2}, \dots, x_{i_{|S|}}) \in \{-1,+1\}^{|S|}$. For a Boolean function $F\colon \{-1,+1\}^k \to \{-1,+1\}$ and a point $x \in \{-1,+1\}^k$, the certificate complexity of $F$ at $x$, denoted $C_x(F)$, is the minimum size of a subset $S \subseteq \{1, 2, \dots, k\}$ such that $F(x) = F(y)$ for all $y \in \{-1,+1\}^k$ with $x|_S = y|_S$. The certificate complexity of $F$, denoted $C(F)$, is the maximum $C_x(F)$ over all $x$. In the degenerate case when $F$ is constant, we have $C(F) = 0$. At the other extreme, the parity function $F\colon \{-1,+1\}^k \to \{-1,+1\}$ satisfies $C(F) = k$, which is the maximum possible. The following proposition is immediate from the definition of certificate complexity.

Proposition 3.2. Let $F\colon \{-1,+1\}^k \to \{-1,+1\}$ be a given Boolean function. Let $y \in \{-1,+1\}^k$ be a random string whose $i$th bit is set to $-1$ with probability $\alpha_i$, and to $+1$ otherwise, independently for each $i$. Then for every $x \in \{-1,+1\}^k$,

\[ \mathop{\mathbf{P}}_{y} [F(x_1, \dots, x_k) = F(x_1 y_1, \dots, x_k y_k)] \geq \min_{i_1 < i_2 < \cdots < i_{C_x(F)}} \prod_{j=1}^{C_x(F)} (1 - \alpha_{i_j}). \]

Proof. Fix a set $S \subseteq \{1, 2, \dots, k\}$ of cardinality $C_x(F)$ such that $F(x) = F(y)$ whenever $x|_S = y|_S$. Then clearly $\mathbf{P}_y[F(\dots, x_i, \dots) = F(\dots, x_i y_i, \dots)] \geq \mathbf{P}_y[y|_S = (1, 1, \dots, 1)]$, and the bound follows. □
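For small $k$, both the certificate complexity and the probability in Proposition 3.2 can be computed exhaustively. The sketch below is a brute-force illustration only; the helper names and the example functions are assumptions introduced here, and the encoding follows the paper's convention that $-1$ stands for "true".

```python
from itertools import combinations, product

def certificate_complexity_at(F, k, x):
    """C_x(F): least |S| such that every y agreeing with x on S has F(y) = F(x)."""
    cube = list(product((-1, 1), repeat=k))
    for size in range(k + 1):
        for S in combinations(range(k), size):
            if all(F(y) == F(x) for y in cube if all(y[i] == x[i] for i in S)):
                return size
    return k  # not reached: S = all coordinates always certifies F(x)

def certificate_complexity(F, k):
    """C(F): maximum of C_x(F) over all inputs x."""
    return max(certificate_complexity_at(F, k, x) for x in product((-1, 1), repeat=k))

def agreement_probability(F, k, x, alpha):
    """Exact P_y[F(x) = F(x_1 y_1, ..., x_k y_k)], where y_i = -1 with probability alpha[i]."""
    total = 0.0
    for y in product((-1, 1), repeat=k):
        weight = 1.0
        for i in range(k):
            weight *= alpha[i] if y[i] == -1 else 1 - alpha[i]
        if F(tuple(a * b for a, b in zip(x, y))) == F(x):
            total += weight
    return total

def AND(x):
    # -1 encodes "true"
    return -1 if all(v == -1 for v in x) else 1

def PARITY(x):
    out = 1
    for v in x:
        out *= v
    return out
```

With these definitions, `certificate_complexity(PARITY, 3)` returns 3 and a constant function has certificate complexity 0, in line with the remarks preceding Proposition 3.2. For any chosen `alpha`, `agreement_probability` can be compared against the product of the $C_x(F)$ smallest values of $1 - \alpha_i$, which is the minimum on the right-hand side of the proposition.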



We can now state and prove the desired generalization of Theorem 3.1.

Theorem 3.3. Let $f\colon X \to \{-1,+1\}$ and $F\colon \{-1,+1\}^k \to \{-1,+1\}$ be given functions, where $X \subset \mathbb{R}^n$ is a finite set. Then for each $\varepsilon, \delta > 0$,

\[ \deg_{\varepsilon + \eta - 2 + 2(1-\delta)^{C(F)}}(F(f, \dots, f)) \geq \deg_\varepsilon(F) \deg_{1-\delta}(f) \tag{3.9} \]

for some $\eta = \eta(\varepsilon, F) > 0$.

Remark 3.4. One recovers Theorem 3.1 by letting $\delta \searrow 0$ in (3.9). We also note that (3.9) is considerably stronger than Theorem 3.1: functions $\{-1,+1\}^k \to \{-1,+1\}$ are known, such as ODD-MAX-BIT [8], with threshold degree 1 and
(1 — ^)-approximate degree for 8 as small as ^ = exp{— ^'^^'^j. Another 
advantage of Theorem 3.3 is that the (1 — ^)-approximate degree is easier to bound 
from below than the threshold degree |l8l|49l|24l|36l[37l, even for S exponentially 
small. For 8 small, the (1 — (5)-approximate degree is essentially equivalent to a 
notion known as perceptron weight 1.30. .81 l49l IT71 l20l l22l l24l Il2l [361 137)1 . 



Proof of Theorem 3.3. Let $D = \deg_\varepsilon(F)$ and $d = \deg_{1-\delta}(f) > 0$. Theorem 2.3 provides a map $\Psi\colon \{-1,+1\}^k \to \mathbb{R}$ such that

\[ \sum_{z \in \{-1,+1\}^k} |\Psi(z)| = 1, \tag{3.10} \]
\[ \sum_{z \in \{-1,+1\}^k} \Psi(z)F(z) > \varepsilon + \eta \tag{3.11} \]

for some $\eta = \eta(\varepsilon, F) > 0$, and $\sum_{z \in \{-1,+1\}^k} \Psi(z)p(z) = 0$ for every polynomial $p$ of degree less than $D$. Analogously, there exists a map $\psi\colon X \to \mathbb{R}$ such that

\[ \sum_{x \in X} |\psi(x)| = 1, \tag{3.12} \]
\[ \sum_{x \in X} \psi(x)f(x) > 1 - \delta, \tag{3.13} \]

and $\sum_{x \in X} \psi(x)p(x) = 0$ for every polynomial $p$ of degree less than $d$. Define $\zeta\colon X^k \to \mathbb{R}$ by

\[ \zeta(\dots, x_i, \dots) = 2^k\, \Psi(\dots, \widetilde{\operatorname{sgn}}\, \psi(x_i), \dots) \prod_{i=1}^{k} |\psi(x_i)|. \]

By the same argument as in Theorem 3.1, we have

\[ \sum_{X^k} \zeta(\dots, x_i, \dots)\, p(\dots, x_i, \dots) = 0 \tag{3.14} \]

for every polynomial $p$ of degree less than $Dd$.

Let $\mu$ be the distribution on $X^k$ given by $\mu(\dots, x_i, \dots) = \prod |\psi(x_i)|$. Since $\psi$ is orthogonal to the constant polynomial 1, the string $(\dots, \widetilde{\operatorname{sgn}}\, \psi(x_i), \dots)$ is distributed uniformly over $\{-1,+1\}^k$ when one samples $(\dots, x_i, \dots)$ according to $\mu$. As a result,

\[ \sum_{X^k} |\zeta(\dots, x_i, \dots)| = \sum_{z \in \{-1,+1\}^k} |\Psi(z)| = 1, \tag{3.15} \]

where the final equality uses (3.10).

Define $A_{+1} = \{x \in X : \psi(x) > 0,\ f(x) = -1\}$ and $A_{-1} = \{x \in X : \psi(x) < 0,\ f(x) = +1\}$. Since $\psi$ is orthogonal to the constant polynomial 1, it follows from (3.12) that

\[ \sum_{x : \psi(x) < 0} |\psi(x)| = \sum_{x : \psi(x) > 0} |\psi(x)| = \frac{1}{2}. \]

In light of (3.13), we see that $\sum_{x \in A_{+1}} |\psi(x)| < \delta/2$ and $\sum_{x \in A_{-1}} |\psi(x)| < \delta/2$. Now, for any given $z \in \{-1,+1\}^k$, the following two random variables are identically distributed:

• the string $(\dots, f(x_i), \dots)$ when one chooses $(\dots, x_i, \dots) \sim \mu$ and conditions on the event that $(\dots, \widetilde{\operatorname{sgn}}\, \psi(x_i), \dots) = z$;

• the string $(\dots, y_i z_i, \dots)$, where $y \in \{-1,+1\}^k$ is a random string whose $i$th bit independently takes on $-1$ with probability $2 \sum_{x \in A_{z_i}} |\psi(x)| < \delta$.

Proposition 3.2 now implies that for each $z \in \{-1,+1\}^k$,

\[ \left| \mathop{\mathbf{E}}_{\mu} \big[ F(\dots, f(x_i), \dots) \;\big|\; (\dots, \widetilde{\operatorname{sgn}}\, \psi(x_i), \dots) = z \big] - F(z) \right| \leq 2 - 2(1-\delta)^{C(F)}. \tag{3.16} \]

We are now prepared to complete the proof. We have

\[
\begin{aligned}
\sum_{X^k} \zeta(\dots, x_i, \dots)\, F(\dots, f(x_i), \dots)
 &= 2^k \mathop{\mathbf{E}}_{\mu} [\Psi(\dots, \widetilde{\operatorname{sgn}}\, \psi(x_i), \dots)\, F(\dots, f(x_i), \dots)] \\
 &\geq \sum_{z \in \{-1,+1\}^k} \Psi(z)F(z) - \{2 - 2(1-\delta)^{C(F)}\} \sum_{z \in \{-1,+1\}^k} |\Psi(z)| \\
 &> \varepsilon + \eta - 2 + 2(1-\delta)^{C(F)},
\end{aligned}
\tag{3.17}
\]

where the last two inequalities use (3.16), (3.10), and (3.11). In view of Theorem 2.3, the exhibited properties (3.14), (3.15), and (3.17) of $\zeta$ force (3.9). □



Theorems 3.1 and 3.3 complement known upper bounds for the approximation of composed functions. The following theorem is due to Buhrman et al. [11], who studied the approximation of Boolean functions with perturbed inputs. We include the proof from [11] and slightly generalize it to any given parameters.

Theorem 3.5 (cf. Buhrman et al.). Fix functions $F\colon \{-1,+1\}^k \to \{-1,+1\}$ and $f\colon X \to \{-1,+1\}$, where $X \subset \mathbb{R}^n$ is finite. Then for all $\Delta, \delta \geq 0$,

\[ \deg_{\eta(\Delta,\delta)}(F(f, \dots, f)) \leq \deg_\Delta(F) \deg_\delta(f), \tag{3.18} \]

where

\[ \eta(\Delta,\delta) = \Delta + 2 - 2\left( \frac{1}{1+\delta} \right)^{C(F)}. \tag{3.19} \]

In particular,

\[ \deg_{1/3}(F(f, \dots, f)) \leq \deg_{1/3}(F) \deg_{1/3}(f) \cdot O(\log\{1 + \deg_{1/3}(F)\}). \tag{3.20} \]

Proof (adapted from Buhrman et al.). Fix polynomials $P$ and $p$ on $\{-1,+1\}^k$ and $X$, respectively. As usual, $P$ may be assumed to be multilinear in view of its domain. Define $\Phi\colon X^k \to \mathbb{R}$ by

\[ \Phi(\dots, x_i, \dots) = P\left( \dots, \frac{p(x_i)}{1 + \|f - p\|_\infty}, \dots \right). \]

Fix any input $(\dots, x_i, \dots) \in X^k$ and consider a random variable $y \in \{-1,+1\}^k$ whose $i$th bit takes on $-1$ with probability

\[ \frac{1}{2} - \frac{f(x_i)p(x_i)}{2(1 + \|f - p\|_\infty)} \leq \frac{\|f - p\|_\infty}{1 + \|f - p\|_\infty}, \]

independently for each $i$. Then

\[
\begin{aligned}
|\Phi(\dots, x_i, \dots) - F(\dots, f(x_i), \dots)|
 &= \left| \mathop{\mathbf{E}}_{y} [P(\dots, y_i f(x_i), \dots) - F(\dots, f(x_i), \dots)] \right| \\
 &\leq \|P - F\|_\infty + \left| \mathop{\mathbf{E}}_{y} [F(\dots, y_i f(x_i), \dots) - F(\dots, f(x_i), \dots)] \right| \\
 &\leq \|P - F\|_\infty + 2 - 2\left( 1 - \frac{\|f - p\|_\infty}{1 + \|f - p\|_\infty} \right)^{C(F)},
\end{aligned}
\]

where the first and last steps in the derivation follow by the multilinearity of $P$ and by Proposition 3.2, respectively. This completes the proof of (3.18).

Taking $\Delta = 1/6$ and $\delta = 1/(12C(F))$ in (3.18) gives

\[ \deg_{1/3}(F(f, \dots, f)) \leq \deg_{1/6}(F) \deg_{1/(12C(F))}(f). \]

Basic approximation theory [14] shows that for each $\varepsilon > 0$, there exists a univariate polynomial of degree $O(\log \frac{1}{\varepsilon})$ that sends $[-\frac{4}{3}, -\frac{2}{3}] \to [-1-\varepsilon, -1+\varepsilon]$ and $[\frac{2}{3}, \frac{4}{3}] \to [1-\varepsilon, 1+\varepsilon]$. As a result, we obtain

\[ \deg_{1/3}(F(f, \dots, f)) \leq \deg_{1/3}(F) \deg_{1/3}(f) \cdot O(\log\{1 + C(F)\}), \]

which is equivalent to (3.20) because $C(F)$ is known to be within a polynomial of $\deg_{1/3}(F)$ for every Boolean function $F\colon \{-1,+1\}^k \to \{-1,+1\}$, as discussed in detail in the survey article [13]. □
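The choice of parameters $\Delta = 1/6$ and $\delta = 1/(12C(F))$ in the proof can be checked numerically. The snippet below is a small sketch that assumes the error expression $\eta(\Delta,\delta) = \Delta + 2 - 2(1+\delta)^{-C(F)}$; the function name `eta` and the tested range of certificate complexities are illustrative.

```python
def eta(Delta: float, delta: float, C: int) -> float:
    """Error parameter eta(Delta, delta) for a combining function with C(F) = C."""
    return Delta + 2 - 2 * (1 / (1 + delta)) ** C

# Largest value of eta(1/6, 1/(12 C)) over a range of certificate complexities.
worst = max(eta(1 / 6, 1 / (12 * C), C) for C in range(1, 500))
```

The value `worst` stays below $1/3$, as expected from Bernoulli's inequality $(1+\delta)^{-C} \geq 1 - C\delta$, which gives $\eta(1/6, 1/(12C)) \leq 1/6 + 2C\delta = 1/3$.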



Compositions with $k$ distinct functions. We now consider compositions of the form $F(f_1, \dots, f_k)$, where the functions $f_1, \dots, f_k$ may all be distinct. For a function $F\colon \{-1,+1\}^k \to \{-1,+1\}$ and a vector $v = (v_1, \dots, v_k)$ of nonnegative integers, define the $(\varepsilon, v)$-approximate degree $\deg_{\varepsilon,v}(F)$ to be the least $D$ for which there is a polynomial $P(x_1, \dots, x_k)$ with

\[ P \in \operatorname{span}\left\{ \prod_{i \in S} x_i \;:\; S \subseteq \{1, 2, \dots, k\},\ \sum_{i \in S} v_i \leq D \right\} \]

and $\|F - P\|_\infty \leq \varepsilon$. Note that the $\varepsilon$-approximate degree of $F$ is the $(\varepsilon, v)$-approximate degree of $F$ for $v = (1, 1, \dots, 1)$. It is clear that

\[ \deg_{\varepsilon,v}(F) \geq \min_{i_1 < i_2 < \cdots < i_{\deg_\varepsilon(F)}} \{ v_{i_1} + v_{i_2} + \cdots + v_{i_{\deg_\varepsilon(F)}} \}, \]

with an arbitrary gap achievable between the right and left members of the inequality. We will also need the following generalized version of Theorem 2.3, due to Ioffe and Tikhomirov [17].

Theorem 3.6 (Ioffe and Tikhomirov). Let $X$ be a finite set. Fix any family $\Phi$ of functions $X \to \mathbb{R}$ and an additional function $f\colon X \to \mathbb{R}$. Then

\[ \min_{\phi \in \operatorname{span}(\Phi)} \|f - \phi\|_\infty = \max_{\psi} \left\{ \sum_{x \in X} f(x)\psi(x) \right\}, \]

where the maximum is over all functions $\psi\colon X \to \mathbb{R}$ such that

\[ \sum_{x \in X} |\psi(x)| \leq 1 \]

and, for each $\phi \in \Phi$,

\[ \sum_{x \in X} \phi(x)\psi(x) = 0. \]

A short proof of Theorem 3.6 can be found, e.g., in [42, §3]. With this setup in place, we obtain the following analogues of Theorems 3.3 and 3.5 for compositions of the form $F(f_1, \dots, f_k)$.

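Theorem 3.7. Fix nonconstant functions $F\colon \{-1,+1\}^k \to \{-1,+1\}$ and $f_i\colon X_i \to \{-1,+1\}$, $i = 1, 2, \dots, k$, where each $X_i \subset \mathbb{R}^n$ is finite. Then for $\varepsilon, \delta > 0$, one has

\[ \deg_{\varepsilon + \eta - 2 + 2(1-\delta)^{C(F)}}(F(f_1, \dots, f_k)) \geq \deg_{\varepsilon,v}(F) \tag{3.21} \]

for some $\eta = \eta(\varepsilon, F) > 0$, where $v = (\deg_{1-\delta}(f_1), \dots, \deg_{1-\delta}(f_k))$.

Proof. Let $D = \deg_{\varepsilon,v}(F)$ and $d_i = \deg_{1-\delta}(f_i)$. Theorem 3.6 provides a map $\Psi\colon \{-1,+1\}^k \to \mathbb{R}$ such that

\[ \sum_{z \in \{-1,+1\}^k} |\Psi(z)| = 1, \tag{3.22} \]
\[ \sum_{z \in \{-1,+1\}^k} \Psi(z)F(z) > \varepsilon + \eta \]

for some $\eta = \eta(\varepsilon, F) > 0$, and

\[ \Psi(z) = \sum_{S \in \mathcal{S}} \hat\Psi(S) \prod_{i \in S} z_i \]

for some reals $\hat\Psi(S)$, where $\mathcal{S} = \{S \subseteq \{1, 2, \dots, k\} : \sum_{i \in S} v_i \geq D\}$. Analogously, there are maps $\psi_i\colon X_i \to \mathbb{R}$, $i = 1, 2, \dots, k$, such that

\[ \sum_{x_i \in X_i} |\psi_i(x_i)| = 1, \qquad \sum_{x_i \in X_i} \psi_i(x_i) f_i(x_i) > 1 - \delta, \]

and $\sum_{x_i \in X_i} \psi_i(x_i) p(x_i) = 0$ for every polynomial $p$ of degree less than $d_i$. Define $\zeta\colon X_1 \times \cdots \times X_k \to \mathbb{R}$ by

\[ \zeta(\dots, x_i, \dots) = 2^k\, \Psi(\dots, \widetilde{\operatorname{sgn}}\, \psi_i(x_i), \dots) \prod_{i=1}^{k} |\psi_i(x_i)|. \]

By an argument analogous to that in Theorem 3.1, we have

\[ \sum_{X_1 \times \cdots \times X_k} \zeta(\dots, x_i, \dots)\, p(\dots, x_i, \dots) = 0 \tag{3.23} \]

for every polynomial $p$ of degree less than $D$.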

Let $\mu$ be the distribution on $X_1 \times \cdots \times X_k$ given by $\mu(\dots, x_i, \dots) = \prod |\psi_i(x_i)|$. Since each $\psi_i$ is orthogonal to the constant polynomial 1, the string $(\dots, \widetilde{\operatorname{sgn}}\, \psi_i(x_i), \dots)$ is distributed uniformly over $\{-1,+1\}^k$ when one samples $(\dots, x_i, \dots)$ according to $\mu$. As a result,

\[ \sum_{X_1 \times \cdots \times X_k} |\zeta(\dots, x_i, \dots)| = \sum_{z \in \{-1,+1\}^k} |\Psi(z)| = 1, \tag{3.24} \]

where the final equality uses (3.22).

By an argument analogous to that in Theorem 3.3, we obtain

\[ \sum_{X_1 \times \cdots \times X_k} \zeta(\dots, x_i, \dots)\, F(\dots, f_i(x_i), \dots) > \varepsilon + \eta - 2 + 2(1-\delta)^{C(F)}. \tag{3.25} \]

In view of Theorem 2.3, the exhibited properties (3.23), (3.24), and (3.25) of $\zeta$ complete the proof. □

Remark 3.8. Analogous to the earlier development, taking $\delta \searrow 0$ in Theorem 3.7 yields the lower bound $\deg_\varepsilon(F(f_1, \dots, f_k)) \geq \deg_{\varepsilon,v}(F)$ for each $\varepsilon > 0$, where $v = (\deg_\pm(f_1), \dots, \deg_\pm(f_k))$.

Theorem 3.9. Fix functions $F\colon \{-1,+1\}^k \to \{-1,+1\}$ and $f_i\colon X_i \to \{-1,+1\}$, $i = 1, 2, \dots, k$, where each $X_i \subset \mathbb{R}^n$ is finite. Then for all $\Delta, \delta \geq 0$,

\[ \deg_{\eta(\Delta,\delta)}(F(f_1, \dots, f_k)) \leq \deg_{\Delta,v}(F), \]

where $v = (\deg_\delta(f_1), \dots, \deg_\delta(f_k))$ and

\[ \eta(\Delta,\delta) = \Delta + 2 - 2\left( \frac{1}{1+\delta} \right)^{C(F)}. \tag{3.26} \]

In particular,

\[ \deg_{1/3}(F(f_1, \dots, f_k)) = \deg_{1/3,v}(F) \cdot O(\log\{1 + \deg_{1/3}(F)\}) \tag{3.27} \]

for $v = (\deg_{1/3}(f_1), \dots, \deg_{1/3}(f_k))$.

Proof. Fix a real polynomial $P$ on $\{-1,+1\}^k$ and polynomials $p_i$ on $X_i$, respectively. As usual, $P$ may be assumed to be multilinear in view of its domain. Define $\Phi\colon X_1 \times \cdots \times X_k \to \mathbb{R}$ by

\[ \Phi(\dots, x_i, \dots) = P\left( \dots, \frac{p_i(x_i)}{1 + \|f_i - p_i\|_\infty}, \dots \right). \]

The remainder of the proof is analogous to that of Theorem 3.5, with the obvious notational changes and an optimal choice of approximants $P, p_1, \dots, p_k$. □



Bounds using block sensitivity. Several results above can be sharpened somewhat using the notion of block sensitivity, denoted $\operatorname{bs}(F)$ for a function $F\colon \{-1,+1\}^k \to \{-1,+1\}$ and defined as the maximum number of nonempty disjoint subsets $S_1, S_2, S_3, \dots \subseteq \{1, 2, \dots, k\}$ such that on some input $x \in \{-1,+1\}^k$, flipping the bits in any one set $S_i$ changes the value of the function. We have:

Proposition 3.10. Let $F\colon \{-1,+1\}^k \to \{-1,+1\}$ be a given Boolean function. Let $y \in \{-1,+1\}^k$ be a random string whose $i$th bit is set to $-1$ with probability at most $\alpha$, independently for each $i$. Then for every $x \in \{-1,+1\}^k$,

\[ \mathop{\mathbf{P}}_{y} [F(x_1, \dots, x_k) \neq F(x_1 y_1, \dots, x_k y_k)] \leq 2\alpha \operatorname{bs}(F). \]

Proof. By monotonicity, we may assume that each bit of $y$ takes on $-1$ with probability exactly $\alpha$. For a fixed integer $r$ and a uniformly random string $y \in \{-1,+1\}^k$ with $|\{i : y_i = -1\}| = r$, the probability that $F(\dots, x_i, \dots) \neq F(\dots, x_i y_i, \dots)$ is clearly at most $\operatorname{bs}(F)/\lfloor k/r \rfloor \leq 2r \operatorname{bs}(F)/k$. Averaging over $r$ gives the sought bound. □

Since by definition $C(F) \geq \operatorname{bs}(F)$ for every function $F\colon \{-1,+1\}^k \to \{-1,+1\}$, use of Proposition 3.10 instead of Proposition 3.2 can lead to sharper bounds in some results of this section. Specifically, Theorems 3.3, 3.5, 3.7, and 3.9 remain valid with (3.9) replaced by

\[ \deg_{\varepsilon + \eta - 4\delta \operatorname{bs}(F)}(F(f, \dots, f)) \geq \deg_\varepsilon(F) \deg_{1-\delta}(f); \tag{3.28} \]

with (3.19) and (3.26) replaced by

\[ \eta(\Delta,\delta) = \Delta + \frac{4\delta \operatorname{bs}(F)}{1+\delta}; \tag{3.29} \]

and with (3.21) replaced by

\[ \deg_{\varepsilon + \eta - 4\delta \operatorname{bs}(F)}(F(f_1, \dots, f_k)) \geq \deg_{\varepsilon,v}(F). \tag{3.30} \]

In particular, we obtain from Theorem 3.3 that

\[
\begin{aligned}
\deg_{1/3}(F(f, \dots, f)) &\geq \deg_{2/3}(F) \deg_{1-(12 \operatorname{bs}(F))^{-1}}(f) \\
 &\geq \deg_{1/3}(F) \deg_{1/3}(f) \cdot \Omega\left( \frac{1}{\operatorname{bs}(F)} \right).
\end{aligned}
\]
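The definition of block sensitivity given above can be evaluated by exhaustive search when $k$ is small. The sketch below is one possible brute-force implementation under that assumption; the function names and the bitmask encoding of blocks are illustrative choices introduced here, with $-1$ encoding "true" as elsewhere in the paper.

```python
from itertools import product

def block_sensitivity(F, k):
    """bs(F): max number of disjoint nonempty blocks that each flip F at some input."""
    def flip(x, mask):
        # Negate the coordinates of x selected by the bitmask.
        return tuple(-v if (mask >> i) & 1 else v for i, v in enumerate(x))

    best = 0
    for x in product((-1, 1), repeat=k):
        sensitive = [m for m in range(1, 1 << k) if F(flip(x, m)) != F(x)]

        def pack(start, used):
            # Largest family of pairwise disjoint sensitive blocks from index `start` on.
            top = 0
            for j in range(start, len(sensitive)):
                if (sensitive[j] & used) == 0:
                    top = max(top, 1 + pack(j + 1, used | sensitive[j]))
            return top

        best = max(best, pack(0, 0))
    return best

def OR3(x):
    return -1 if any(v == -1 for v in x) else 1

def MAJ3(x):
    return -1 if sum(x) < 0 else 1
```

For these two examples the search gives `block_sensitivity(OR3, 3) == 3` and `block_sensitivity(MAJ3, 3) == 2`. The running time is exponential in $k$, so this is suitable only for experimenting with very small combining functions.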

3.2 Auxiliary results on rational approximation 

In this section, we prove a number of auxiliary facts about uniform approximation and sign-representation. This preparatory work will set the stage for our analysis of conjunctions of functions. We start by spelling out the exact relationship between the rational approximation and sign-representation of a Boolean function.

Theorem 3.11. Let $f\colon X \to \{-1,+1\}$ be a given function, where $X \subset \mathbb{R}^n$ is finite. Then for every integer $d$,

\[ \deg_\pm(f) \leq d \iff R^+(f,d) < 1. \]

Proof. For the forward implication, let $p$ be a polynomial of degree at most $d$ such that $f(x)p(x) > 0$ for every $x \in X$. Letting $M = \max_{x \in X} |p(x)|$ and $m = \min_{x \in X} |p(x)|$, we have

\[ R^+(f,d) \leq \max_{x \in X} \left| f(x) - \frac{p(x)}{M} \right| \leq 1 - \frac{m}{M} < 1. \]

For the converse, fix a degree-$d$ rational function $p(x)/q(x)$ with $q(x) > 0$ on $X$ and $\max_X |f(x) - \{p(x)/q(x)\}| < 1$. Then clearly $f(x)p(x) > 0$ on $X$. □

Our next observation amounts to reformulating the rational approximation of Boolean functions in a way that is more analytically pleasing.

Theorem 3.12. Let $f\colon X \to \{-1,+1\}$ be a given function, where $X \subset \mathbb{R}^n$ is finite. Then for every integer $d \geq \deg_\pm(f)$, one has

\[ R^+(f,d) = \inf_{c} \frac{c^2 - 1}{c^2 + 1}, \]

where the infimum is over all $c \geq 1$ for which there exist polynomials $p, q$ of degree up to $d$ such that $0 < \frac{1}{c} q(x) \leq f(x)p(x) \leq c\,q(x)$ on $X$.

Proof. In view of Theorem 3.11, the quantity $R^+(f,d)$ is the infimum over all $\varepsilon < 1$ for which there exist polynomials $p$ and $q$ of degree up to $d$ such that $0 < (1-\varepsilon)q(x) \leq f(x)p(x) \leq (1+\varepsilon)q(x)$ on $X$. Equivalently, one may require that

\[ 0 < \frac{1-\varepsilon}{\sqrt{1-\varepsilon^2}}\, q(x) \leq f(x)p(x) \leq \frac{1+\varepsilon}{\sqrt{1-\varepsilon^2}}\, q(x). \]

Letting $c = c(\varepsilon) = \sqrt{(1+\varepsilon)/(1-\varepsilon)}$, the theorem follows. □
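The forward implication of Theorem 3.11 and the change of variables in Theorem 3.12 can both be checked on a toy instance. In the sketch below, the function `f` (the AND of three bits, with $-1$ encoding "true") and its sign-representing polynomial `p` are hypothetical examples chosen for illustration; the helper names are likewise assumptions.

```python
import math
from itertools import product

def rational_error_from_sign_rep(f, p, X):
    """Error of the approximant p(x)/M with constant denominator M = max |p|."""
    assert all(f(x) * p(x) > 0 for x in X), "p must sign-represent f on X"
    M = max(abs(p(x)) for x in X)
    return max(abs(f(x) - p(x) / M) for x in X)

def c_from_eps(eps: float) -> float:
    """The parameter c(eps) = sqrt((1 + eps) / (1 - eps)) from Theorem 3.12."""
    return math.sqrt((1 + eps) / (1 - eps))

X = list(product((-1, 1), repeat=3))
f = lambda x: -1 if all(v == -1 for v in x) else 1   # AND of three bits
p = lambda x: x[0] + x[1] + x[2] + 2.5                # hypothetical sign-representation

err = rational_error_from_sign_rep(f, p, X)
```

Here $m = 0.5$ and $M = 5.5$, so `err` equals $1 - m/M = 10/11 < 1$, as in the proof of Theorem 3.11. For any $\varepsilon \in [0,1)$, the value `c = c_from_eps(eps)` satisfies $(c^2 - 1)/(c^2 + 1) = \varepsilon$ up to rounding, which is the identity underlying Theorem 3.12.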

We will now show that if a degree-$d$ rational approximant achieves error $\varepsilon$ in approximating a given Boolean function, then a degree-$2d$ approximant can achieve error as small as $\varepsilon^2$. Note that this result is a refinement of Theorem 2.5 for small $k$.

Theorem 3.13. Let $f\colon X \to \{-1,+1\}$ be a given function, where $X \subseteq \mathbb{R}^n$. Let $d$ be a given integer. Then

\[ R^+(f,2d) \leq \left( \frac{\varepsilon}{1 + \sqrt{1-\varepsilon^2}} \right)^2, \]

where $\varepsilon = R(f,d)$.

Proof. The theorem is clearly true for $\varepsilon = 1$. For $0 \leq \varepsilon < 1$, consider the univariate rational function

\[ S(t) = \frac{4\sqrt{1-\varepsilon^2}}{1 + \sqrt{1-\varepsilon^2}} \cdot \frac{t}{t^2 + (1-\varepsilon^2)}. \]

Calculus shows that

\[ \max_{1-\varepsilon \leq |t| \leq 1+\varepsilon} |\operatorname{sgn} t - S(t)| = \left( \frac{\varepsilon}{1 + \sqrt{1-\varepsilon^2}} \right)^2. \]

Fix a sequence $A_1, A_2, \dots$ of rational functions of degree at most $d$ such that $\sup_{x \in X} |f(x) - A_m(x)| \to \varepsilon$ as $m \to \infty$. Then $S(A_1(x)), S(A_2(x)), \dots$ is the sought sequence of approximants to $f$, each a rational function of degree at most $2d$ with a positive denominator. □
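The "calculus" step in the proof of Theorem 3.13 can be sanity-checked on a grid. The following sketch assumes the degree-2 rational function $S(t) = \frac{4a}{1+a} \cdot \frac{t}{t^2 + a^2}$ with $a = \sqrt{1-\varepsilon^2}$; the function names and sample count are illustrative.

```python
import math

def squaring_approximant(eps: float):
    """Degree-2 rational approximant to sgn t on 1 - eps <= |t| <= 1 + eps."""
    a = math.sqrt(1 - eps * eps)
    scale = 4 * a / (1 + a)
    return lambda t: scale * t / (t * t + a * a)

def squaring_error(eps: float, samples: int = 4000) -> float:
    """Largest sampled value of |sgn t - S(t)| over 1 - eps <= |t| <= 1 + eps."""
    S = squaring_approximant(eps)
    lo, hi = 1 - eps, 1 + eps
    worst = 0.0
    for j in range(samples + 1):
        t = lo + (hi - lo) * j / samples
        worst = max(worst, abs(1 - S(t)), abs(-1 - S(-t)))
    return worst
```

The sampled maximum should agree with $(\varepsilon/(1+\sqrt{1-\varepsilon^2}))^2$ up to rounding: the error attains that value at the endpoints $t = 1 \pm \varepsilon$ and at the interior point $t = \sqrt{1-\varepsilon^2}$. In particular the new error is at most $\varepsilon^2$.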

Corollary 3.14. Let $f\colon X \to \{-1,+1\}$ be a given function, where $X \subseteq \mathbb{R}^n$. Then for all integers $d \geq 1$ and reals $t \geq 2$,

\[ R^+(f,td) \leq R(f,d)^{t/2}. \]

Proof. If $t = 2^k$ for some integer $k \geq 1$, then repeated applications of Theorem 3.13 yield $R^+(f, 2^k d) \leq R(f, 2^{k-1}d)^2 \leq \cdots \leq R(f,d)^{2^k}$. The general case follows because $2^{\lfloor \log t \rfloor} \geq t/2$. □



3.3 Conjunctions of functions 



In this section, we prove our direct product theorems for conjunctions of Boolean functions. Recall that a key challenge will be, given a sign-representation $\phi(x,y)$ of a composite function $f(x) \wedge g(y)$, to suitably break down $\phi$ and recover individual rational approximants of $f$ and $g$. We now present an ingredient of our solution, namely, a certain fact about pairs of matrices based on Farkas' Lemma. For the time being, we will formulate this fact in a clean and abstract way.

Theorem 3.15. Fix matrices $A, B \in \mathbb{R}^{m \times n}$ and a real $c \geq 1$. Consider the following system of linear inequalities in $u, v \in \mathbb{R}^n$:

\[
\left.
\begin{aligned}
\frac{1}{c} Au \leq Bv &\leq cAu, \\
u &\geq 0, \\
v &\geq 0.
\end{aligned}
\right\}
\tag{3.31}
\]

If $u = v = 0$ is the only solution to (3.31), then there exist vectors $w \geq 0$ and $z \geq 0$ such that

\[ w^{\mathsf T} A + z^{\mathsf T} B > c\,(z^{\mathsf T} A + w^{\mathsf T} B). \]

Proof. If $u = v = 0$ is the only solution to (3.31), then linear programming duality implies the existence of vectors $w \geq 0$ and $z \geq 0$ such that $w^{\mathsf T} A > c z^{\mathsf T} A$ and $z^{\mathsf T} B > c w^{\mathsf T} B$. Adding the last two inequalities completes the proof. □

For clarity of exposition, we first prove the main result of this section for the case of two Boolean functions at least one of which is odd. While this case seems restricted, we will see that it captures the full complexity of the problem.

Theorem 3.16. Let $f\colon X \to \{-1,+1\}$ and $g\colon Y \to \{-1,+1\}$ be given functions, where $X, Y \subset \mathbb{R}^n$ are arbitrary finite sets. Assume that $f \not\equiv 1$ and $g \not\equiv 1$. Let $d = \deg_\pm(f \wedge g)$. If $f$ is odd, then

\[ R^+(f,2d) + R^+(g,d) < 1. \]

Proof. We first collect some basic observations. Since $f \not\equiv 1$ and $g \not\equiv 1$, we have $\deg_\pm(f) \leq d$ and $\deg_\pm(g) \leq d$. Therefore, Theorem 3.11 implies that

\[ R^+(f,d) < 1, \qquad R^+(g,d) < 1. \tag{3.32} \]

In particular, the theorem holds if $R^+(g,d) = 0$. In the remainder of the proof, we assume that $R^+(g,d) = \varepsilon$, where $0 < \varepsilon < 1$.

By hypothesis, there exists a degree-$d$ polynomial $\phi$ such that $f(x) \wedge g(y) = \operatorname{sgn} \phi(x,y)$ for all $x \in X$, $y \in Y$. Define

\[ X^- = \{x \in X : f(x) = -1\}. \]

Since $X$ is closed under negation and $f$ is odd, we have $f(x) = 1 \iff -x \in X^-$. We will make several uses of this fact in what follows, without further mention. Put

\[ c = \sqrt{\frac{1 + (1-\delta)\varepsilon}{1 - (1-\delta)\varepsilon}}, \]

where $\delta \in (0,1)$ is sufficiently small. Since $R^+(g,d) > (c^2-1)/(c^2+1)$, we know by Theorem 3.12 that there cannot exist polynomials $p, q$ of degree up to $d$ such that

\[ 0 < \frac{1}{c} q(y) \leq g(y)p(y) \leq c\,q(y), \qquad y \in Y. \tag{3.33} \]

We claim, then, that there cannot exist reals $a_x \geq 0$, $x \in X$, not all zero, such that

\[ \frac{1}{c} \sum_{x \in X^-} a_{-x} \phi(-x,y) \leq g(y) \sum_{x \in X^-} a_x \phi(x,y) \leq c \sum_{x \in X^-} a_{-x} \phi(-x,y), \qquad y \in Y. \]

Indeed, if such reals $a_x$ were to exist, then (3.33) would hold for the polynomials $p(y) = \sum_{x \in X^-} a_x \phi(x,y)$ and $q(y) = \sum_{x \in X^-} a_{-x} \phi(-x,y)$. In view of the nonexistence of the $a_x$, Theorem 3.15 applies to the matrices

\[ \Big[ \phi(-x,y) \Big]_{y \in Y,\, x \in X^-}, \qquad \Big[ g(y)\phi(x,y) \Big]_{y \in Y,\, x \in X^-} \]

and guarantees the existence of nonnegative reals $\lambda_y, \mu_y$ for $y \in Y$ such that

\[ \sum_{y \in Y} \lambda_y \phi(-x,y) + \sum_{y \in Y} \mu_y g(y)\phi(x,y) > c \left\{ \sum_{y \in Y} \mu_y \phi(-x,y) + \sum_{y \in Y} \lambda_y g(y)\phi(x,y) \right\}, \qquad x \in X^-. \tag{3.34} \]

Define polynomials $\alpha, \beta$ on $X$ by

\[ \alpha(x) = \sum_{y \in g^{-1}(-1)} \{\lambda_y \phi(-x,y) - \mu_y \phi(x,y)\}, \]
\[ \beta(x) = \sum_{y \in g^{-1}(1)} \{\lambda_y \phi(-x,y) + \mu_y \phi(x,y)\}. \]

Then (3.34) can be restated as

\[ \alpha(x) + \beta(x) > c\,\{-\alpha(-x) + \beta(-x)\}, \qquad x \in X^-. \]

Both members of this inequality are nonnegative, and thus $\{\alpha(x) + \beta(x)\}^2 > c^2 \{-\alpha(-x) + \beta(-x)\}^2$ for $x \in X^-$. Since in addition $\alpha(-x) \leq 0$ and $\beta(-x) \geq 0$ for $x \in X^-$, we have

\[ \{\alpha(x) + \beta(x)\}^2 > c^2 \{\alpha(-x) + \beta(-x)\}^2, \qquad x \in X^-. \]

Letting $\gamma(x) = \{\alpha(x) + \beta(x)\}^2$, we see that

\[ R^+(f,2d) \leq \max_{x \in X} \left| f(x) - \frac{c^2+1}{c^2} \cdot \frac{\gamma(-x) - \gamma(x)}{\gamma(-x) + \gamma(x)} \right| \leq \frac{1}{c^2} < 1 - \varepsilon, \]

where the final inequality holds for all $\delta \in (0,1)$ small enough. □



Remark. In Theorem 3.16 and elsewhere in this paper, the degree of a multivariate polynomial $p(x_1, x_2, \dots, x_n)$ is defined as the greatest total degree of any monomial of $p$. A related notion is the partial degree of $p$, which is the maximum degree of $p$ in any one of the variables $x_1, x_2, \dots, x_n$. One readily sees that the proof of Theorem 3.16 applies unchanged to this alternate notion. Specifically, if the conjunction $f(x) \wedge g(y)$ can be sign-represented by a polynomial of partial degree $d$, then there exist rational functions $F(x)$ and $G(y)$ of partial degree $2d$ such that $\|f - F\|_\infty + \|g - G\|_\infty < 1$. In the same way, the program of Section 3.4 carries over, with cosmetic changes, to the notion of partial degree. Analogously, our proofs apply to hybrid definitions of degree, such as partial degree over blocks of variables. Other, more abstract notions of degree can also be handled. In the remainder of the paper, we will maintain our focus on total degree and will not elaborate further on its generalizations.



As promised, we will now remove the assumption, made in Theorem 3.16, about one of the functions being odd. The result that we are about to prove settles Theorem 1.4 from the Introduction.

Theorem 3.17. Let $f\colon X \to \{-1,+1\}$ and $g\colon Y \to \{-1,+1\}$ be given functions, where $X, Y \subset \mathbb{R}^n$ are arbitrary finite sets. Assume that $f \not\equiv 1$ and $g \not\equiv 1$. Let $d = \deg_\pm(f \wedge g)$. Then

\[ R^+(f,4d) + R^+(g,2d) < 1 \tag{3.35} \]

and, by symmetry,

\[ R^+(f,2d) + R^+(g,4d) < 1. \]

Proof. It suffices to prove (3.35). Define $X' \subset \mathbb{R}^{n+1}$ by $X' = \{(x,1), (-x,-1) : x \in X\}$. It is clear that $X'$ is closed under negation. Let $f'\colon X' \to \{-1,+1\}$ be the odd Boolean function given by

\[ f'(x,b) = \begin{cases} f(x), & b = 1, \\ -f(-x), & b = -1. \end{cases} \]

Let $\phi$ be a polynomial of degree no greater than $d$ such that $f(x) \wedge g(y) = \operatorname{sgn} \phi(x,y)$. Fix an input $\tilde{x} \in X$ such that $f(\tilde{x}) = -1$. Then $f'(x,b) \wedge g(y) = \operatorname{sgn}\{K(1+b)\phi(x,y) + \phi(-x,y)\phi(\tilde{x},y)\}$ for a large enough constant $K \gg 1$, whence

\[ \deg_\pm(f' \wedge g) \leq 2d. \]

Theorem 3.16 now yields $R^+(f',4d) + R^+(g,2d) < 1$. Since $R^+(f,4d) \leq R^+(f',4d)$ by definition, the proof is complete. □
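The sign identity at the heart of this proof can be verified exhaustively on a toy instance. Everything in the sketch below is a hypothetical example constructed for illustration: the sets `X` and `Y`, the functions `f` and `g`, the degree-1 sign-representation `phi`, and the constant `K = 100`, which is merely one value that is large enough here.

```python
X = [1, 2]
Y = [-1, 1]
f = lambda x: -1 if x == 1 else 1          # not identically 1; -1 encodes "true"
g = lambda y: y
AND = lambda a, b: -1 if (a == -1 and b == -1) else 1
phi = lambda x, y: 2 * x + y - 2           # sgn phi(x, y) = f(x) AND g(y) on X x Y
sgn = lambda v: (v > 0) - (v < 0)

# The symmetrized domain X' and the odd function f' from the proof.
X_odd = [(x, 1) for x in X] + [(-x, -1) for x in X]

def f_odd(x, b):
    return f(x) if b == 1 else -f(-x)

x_tilde = 1                                # an input with f(x_tilde) = -1
K = 100                                    # assumed "large enough" constant

def Phi(x, b, y):
    """Candidate sign-representation of f'(x, b) AND g(y), of degree at most 2d."""
    return K * (1 + b) * phi(x, y) + phi(-x, y) * phi(x_tilde, y)

identity_holds = all(
    sgn(Phi(x, b, y)) == AND(f_odd(x, b), g(y)) for (x, b) in X_odd for y in Y
)
```

On this instance `identity_holds` evaluates to `True`. For $b = 1$ the term $2K\phi(x,y)$ dominates, while for $b = -1$ the first term vanishes and the sign is determined by the product $\phi(-x,y)\,\phi(\tilde{x},y)$.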



Finally, we obtain an analogue of Theorem 3.17 for a conjunction of three and more functions.

Theorem 3.18. Let $f_1, f_2, \dots, f_k$ be given Boolean functions on finite sets $X_1, X_2, \dots, X_k \subset \mathbb{R}^n$, respectively. Assume that $f_i \not\equiv 1$ for $i = 1, 2, \dots, k$. Let $d = \deg_\pm(f_1 \wedge f_2 \wedge \cdots \wedge f_k)$. Then

\[ \sum_{i=1}^{k} R^+(f_i, D) < 1 \]

for $D = 8d \log 2k$.

Proof. Since $f_1, f_2, \dots, f_k \not\equiv 1$, it follows that for each pair of indices $i < j$, the function $f_i \wedge f_j$ is a subfunction of $f_1 \wedge f_2 \wedge \cdots \wedge f_k$. Theorem 3.17 now shows that for each $i < j$,

\[ R^+(f_i,4d) + R^+(f_j,4d) < 1. \tag{3.36} \]

Without loss of generality, $R^+(f_1,4d) = \max_{i=1,\dots,k} R^+(f_i,4d)$. Abbreviate $\varepsilon = R^+(f_1,4d)$. By (3.36),

\[ R^+(f_i,4d) < \min\left\{ 1-\varepsilon,\ \frac{1}{2} \right\}, \qquad i = 2, 3, \dots, k. \]

Now Corollary 3.14 implies that

\[ \sum_{i=1}^{k} R^+(f_i,D) \leq \varepsilon + \sum_{i=2}^{k} R^+(f_i,4d)^{\log 2k} \leq \varepsilon + \frac{(k-1)(1-\varepsilon)}{k} < 1. \]

□

3.4 Other combining functions 



As we will now see, the development in Section 3.3 applies to many combining functions other than conjunctions. Disjunctions are an illustrative starting point. Consider two Boolean functions $f\colon X \to \{-1,+1\}$ and $g\colon Y \to \{-1,+1\}$, where $X, Y \subset \mathbb{R}^n$ are finite sets and $f, g \not\equiv -1$. Let $d = \deg_\pm(f \vee g)$. Then, we claim that

\[ R^+(f,4d) + R^+(g,4d) < 1. \tag{3.37} \]

To see this, note first that the function $f \vee g$ has the same threshold degree as its negation, $\overline{f} \wedge \overline{g}$. Applying Theorem 3.17 to the latter function shows that

\[ R^+(\overline{f},4d) + R^+(\overline{g},4d) < 1. \]

This is equivalent to (3.37) since approximating a function is the same as approximating its negation: $R^+(\overline{f},4d) = R^+(f,4d)$ and $R^+(\overline{g},4d) = R^+(g,4d)$. As in the case of conjunctions, (3.37) can be strengthened to

\[ R^+(f,2d) + R^+(g,2d) < 1 \]

if at least one of $f, g$ is known to be odd. These observations carry over to disjunctions of multiple functions, $f_1 \vee f_2 \vee \cdots \vee f_k$.

The above discussion is still too specialized. In what follows, we consider composite functions $h(f_1, f_2, \dots, f_k)$, where $h\colon \{-1,+1\}^k \to \{-1,+1\}$ is any given Boolean function. We will shortly see that the results of the previous sections hold for various $h$ other than $h = \text{AND}$ and $h = \text{OR}$.

We start with some notation and definitions. Let $f, h\colon \{-1,+1\}^k \to \{-1,+1\}$ be given Boolean functions. Recall that $f$ is called a subfunction of $h$ if for some fixed strings $y, z \in \{-1,+1\}^k$, one has

\[ f(x) = h(\dots, (x_i \wedge y_i) \vee z_i, \dots) \]

for each $x \in \{-1,+1\}^k$. In words, $f$ can be obtained from $h$ by replacing some of the variables $x_1, x_2, \dots, x_k$ with fixed values ($-1$ or $+1$).

Definition 3.19. A function $F\colon \{-1,+1\}^k \to \{-1,+1\}$ is AND-reducible if for each pair of indices $i, j$, where $1 \leq i < j \leq k$, at least one of the eight functions

\[
\begin{aligned}
&x_i \wedge x_j, \qquad x_i \vee x_j, \qquad \overline{x_i} \wedge x_j, \qquad \overline{x_i} \vee x_j, \\
&x_i \wedge \overline{x_j}, \qquad x_i \vee \overline{x_j}, \qquad \overline{x_i} \wedge \overline{x_j}, \qquad \overline{x_i} \vee \overline{x_j}
\end{aligned}
\]

is a subfunction of $F(x)$.
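Definition 3.19 can be tested by brute force for small $k$. The sketch below rests on one observation that is not spelled out in the text: a two-variable Boolean function is one of the eight listed functions exactly when its truth table contains one or three "true" entries (recall that $-1$ encodes "true"). The function names are illustrative.

```python
from itertools import combinations, product

def is_and_reducible(F, k):
    """Check Definition 3.19 by trying every restriction of the other k - 2 variables."""
    for i, j in combinations(range(k), 2):
        others = [t for t in range(k) if t not in (i, j)]
        found = False
        for fixed in product((-1, 1), repeat=len(others)):
            trues = 0
            for a, b in product((-1, 1), repeat=2):
                x = [0] * k
                x[i], x[j] = a, b
                for t, v in zip(others, fixed):
                    x[t] = v
                trues += F(tuple(x)) == -1
            # AND/OR of two literals: exactly 1 or exactly 3 inputs map to "true".
            if trues in (1, 3):
                found = True
                break
        if not found:
            return False
    return True

def AND3(x):
    return -1 if all(v == -1 for v in x) else 1

def MAJ3(x):
    return -1 if sum(x) < 0 else 1

def PARITY3(x):
    return x[0] * x[1] * x[2]
```

On these examples, `AND3` and the halfspace `MAJ3` are AND-reducible, while `PARITY3` is not, since every two-variable restriction of parity is again a parity or its negation.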

Theorem 3.20. Let $f_1, f_2, \dots, f_k$ be nonconstant Boolean functions on finite sets
$X_1, X_2, \dots, X_k \subset \mathbb{R}^n$, respectively. Let $F\colon \{-1,+1\}^k \to \{-1,+1\}$ be an
AND-reducible function. Put $d = \deg_\pm(F(f_1, f_2, \dots, f_k))$. Then

\[ \sum_{i=1}^{k} R^+(f_i, D) < 1 \]

for $D = 8d \log 2k$.

Proof. Since $F$ is AND-reducible, it follows that for each pair of indices $i < j$,
one of the following eight functions is a subfunction of $F(f_1, \dots, f_k)$:

\[ f_i \wedge f_j, \quad \overline{f_i} \wedge f_j, \quad f_i \wedge \overline{f_j}, \quad \overline{f_i} \wedge \overline{f_j}, \quad
   f_i \vee f_j, \quad \overline{f_i} \vee f_j, \quad f_i \vee \overline{f_j}, \quad \overline{f_i} \vee \overline{f_j}. \]

By Theorem 3.17 (and the opening remarks of this section),

\[ R^+(f_i, 4d) + R^+(f_j, 4d) < 1. \]

The remainder of the proof is identical to the proof of Theorem 3.18, starting at
equation (3.36). □

In summary, the development in Section 3.3 naturally extends to compositions
$F(f_1, f_2, \dots, f_k)$ for various $F$. For a function $F\colon \{-1,+1\}^k \to \{-1,+1\}$ to
be AND-reducible, $F$ must clearly depend on all of its inputs. This necessary
condition is often sufficient, for example when $F$ is a read-once AND/OR/NOT
formula or a halfspace. Hence, Theorem 1.5 from the Introduction is a corollary of
Theorem 3.20.

Remark. If more information is available about the combining function $F$,
Theorem 3.20 can be generalized to let some of $f_1, \dots, f_k$ be constant functions. For
example, some or all of the functions $f_1, \dots, f_k$ in Theorem 3.18 can be identically
true. Another direction for generalization is as follows. In Definition 3.19, one
considers all the distinct pairs of indices $(i, j)$. If one happens to know that
$f_1$ is harder to approximate than $f_2, \dots, f_k$, then one can relax Definition 3.19 to
examine only the $k - 1$ pairs $(1,2), (1,3), \dots, (1,k)$. We do not formulate these
extensions as theorems, the fundamental technique being already clear.



3.5 Additional observations 



Analogous to Section 3.1, our results here can be viewed as a technique for proving
lower bounds on the threshold degree of composite functions $F(f_1, f_2, \dots, f_k)$.
We make this view explicit in the following statement, which is the contrapositive
of Theorem 3.20.

Theorem 3.21. Let $f_1, f_2, \dots, f_k$ be nonconstant Boolean functions on finite sets
$X_1, X_2, \dots, X_k \subset \mathbb{R}^n$, respectively. Let $F\colon \{-1,+1\}^k \to \{-1,+1\}$ be an
AND-reducible function. Suppose that $\sum_{i=1}^{k} R^+(f_i, D) \ge 1$ for some integer $D$. Then

\[ \deg_\pm(F(f_1, f_2, \dots, f_k)) > \frac{D}{8 \log 2k}. \tag{3.38} \]



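Remark 3.22 (On the tightness of Theorem 3.21). Theorem 3.21 is close to
optimal. For example, when $F = \mathrm{AND}$, the lower bound in (3.38) is tight up to a
factor of $\Theta(k \log k)$. This can be seen by the well-known argument [9] described in
the Introduction. Specifically, fix an integer $D$ such that $\sum_{i=1}^{k} R^+(f_i, D) < 1$. Then
there exists a rational function $p_i(x_i)/q_i(x_i)$ on $X_i$, for $i = 1, 2, \dots, k$, such that
$q_i$ is positive on $X_i$ and

\[ \sum_{i=1}^{k} \max_{x_i \in X_i} \left| f_i(x_i) - \frac{p_i(x_i)}{q_i(x_i)} \right| < 1. \]

As a result,

\[ \bigwedge_{i=1}^{k} f_i(x_i) \equiv \operatorname{sgn}\Bigl( k - 1 + \sum_{i=1}^{k} f_i(x_i) \Bigr)
   \equiv \operatorname{sgn}\Bigl( k - 1 + \sum_{i=1}^{k} \frac{p_i(x_i)}{q_i(x_i)} \Bigr). \]

Multiplying by $\prod_{i=1}^{k} q_i(x_i)$ yields

\[ \bigwedge_{i=1}^{k} f_i(x_i) \equiv \operatorname{sgn}\Bigl( (k-1) \prod_{i=1}^{k} q_i(x_i) + \sum_{i=1}^{k} p_i(x_i) \prod_{j \ne i} q_j(x_j) \Bigr), \]

whence $\deg_\pm(f_1 \wedge f_2 \wedge \cdots \wedge f_k) \le kD$. This settles our claim regarding $F = \mathrm{AND}$.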
For arbitrary AND-reducible functions F : {—1, +1}^ — > {—1, +1}, a similar ar- 



gument (cf. Theorem 3 1 of Klivans et al. EOl ) shows that the lower bound in ( 3.38 1 
is tight up to a polynomial in k. 



We close this section with one additional result. 

Theorem 3.23. Let $f\colon X \to \{-1,+1\}$ be a given function, where $X \subset \mathbb{R}^n$ is
finite. Then for every integer $k \ge 2$,

\[ \deg_\pm(\underbrace{f \wedge f \wedge \cdots \wedge f}_{k}) \le (8k \log k) \cdot \deg_\pm(f \wedge f). \tag{3.39} \]

Proof. Put $d = \deg_\pm(f \wedge f)$. Theorem 3.17 implies that $R^+(f, 4d) < 1/2$,
whence $R^+(f, 8d \log k) < 1/k$ by Corollary 3.14. By the argument in
Remark 3.22, this proves the theorem. □



To illustrate, let $\mathcal{C}$ be a given class of functions on $\{-1,+1\}^n$, such as
halfspaces. Theorem 3.23 shows that the task of constructing a sign-representation for
the intersections of up to $k$ members from $\mathcal{C}$ reduces to the case $k = 2$. In other
words, solving the problem for $k = 2$ essentially solves it for all $k$. The dependence
on $k$ in (3.39) is tight up to a factor of $16 \log k$, even in the simple case when $f$ is
the OR function EOl. 



4 Rational approximation of a halfspace 

In this section, we determine how well a rational function of any given degree can
approximate the canonical halfspace. The lower bounds in Theorem 1.6, the main
result to be proved in this section, are considerably more involved than the upper
bounds. To help build some intuition in the former case, we first obtain the upper
bounds (Section 4.1) and only then prove the lower bounds (Sections 4.2 and 4.3).

4.1 Upper bounds

As shown in the Introduction, the OR function on $n$ bits has $R^+(\mathrm{OR}, 1) = 0$. A
similar example is the ODD-MAX-BIT function $f\colon \{0,1\}^n \to \{-1,+1\}$, due to
Beigel [8], defined by

\[ f(x) = \operatorname{sgn}\Bigl( 1 + \sum_{i=1}^{n} (-2)^i x_i \Bigr). \]

Indeed, letting

\[ A_M(x) = \frac{1 + \sum_{i=1}^{n} (-M)^i x_i}{1 + \sum_{i=1}^{n} M^i x_i}, \]

we have $\|f - A_M\|_\infty \to 0$ as $M \to \infty$. Thus, $R^+(f, 1) = 0$. With this construction
in mind, we now turn to the canonical halfspace. We start with an auxiliary result
that generalizes the argument just given.
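The convergence claimed for ODD-MAX-BIT is easy to observe numerically. The following sketch assumes the reading $f(x) = \operatorname{sgn}(1 + \sum_i (-2)^i x_i)$ and $A_M(x) = (1 + \sum_i (-M)^i x_i)/(1 + \sum_i M^i x_i)$ of the construction above; the function names are ours. It uses exact rational arithmetic, so no rounding affects the comparison.

```python
from fractions import Fraction
from itertools import product

def odd_max_bit(x):
    # f(x) = sgn(1 + sum_i (-2)^i x_i), with bits indexed from i = 1;
    # the argument of sgn is always odd, hence nonzero.
    s = 1 + sum((-2) ** i * b for i, b in enumerate(x, start=1))
    return 1 if s > 0 else -1

def A(x, M):
    # Degree-1 rational approximant: the highest set bit dominates as M grows.
    num = 1 + sum((-M) ** i * b for i, b in enumerate(x, start=1))
    den = 1 + sum(M ** i * b for i, b in enumerate(x, start=1))
    return Fraction(num, den)

def max_error(n, M):
    """Worst-case |f(x) - A_M(x)| over the whole cube {0,1}^n."""
    return max(abs(odd_max_bit(x) - A(x, M)) for x in product((0, 1), repeat=n))
```

For a fixed $n$, calling `max_error(n, M)` with growing $M$ shows the error shrinking roughly like $2/M$, in line with $R^+(f,1) = 0$.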

Lemma 4.1. Let $f\colon \{0,\pm1,\pm2\}^n \to \{-1,+1\}$ be the function given by
$f(z) = \operatorname{sgn}(1 + \sum_{i=1}^{n} 2^i z_i)$. Then

\[ R^+(f, 64) = 0. \]

Proof. Consider the deterministic finite automaton in Figure 1. The automaton
has two terminal states (labeled "+" and "−") and three nonterminal states (the
start state and two additional states). We interpret the output of the automaton to
be $+1$ and $-1$ at the two terminal states, respectively, and $0$ otherwise. A string
$z = (z_n, z_{n-1}, \dots, z_1, 0) \in \{0,\pm1,\pm2\}^{n+1}$, when read by the automaton left to
right, forces it to output exactly $\operatorname{sgn}(\sum_{i=1}^{n} 2^i z_i)$. If the automaton is currently at
a nonterminal state, this state is determined uniquely by the last two symbols
read. Hence, the output of the automaton on input $z = (z_n, z_{n-1}, \dots, z_1, 0) \in
\{0,\pm1,\pm2\}^{n+1}$ is given by

\[ \operatorname{sgn}\Bigl( \sum_{i=0}^{n} 2^i \alpha(z_{i+2}, z_{i+1}, z_i) \Bigr) \]

for a suitable map $\alpha\colon \{0,\pm1,\pm2\}^3 \to \{0,-1,+1\}$, where we adopt the shorthand
$z_{n+1} = z_{n+2} = z_0 = 0$. Put

\[ A_M = \frac{1 + \sum_{i=0}^{n} M^{i+1} \alpha(z_{i+2}, z_{i+1}, z_i)}{1 + \sum_{i=0}^{n} M^{i+1} |\alpha(z_{i+2}, z_{i+1}, z_i)|}. \]

By interpolation, the numerator and denominator of $A_M$ can be represented by
polynomials of degree no more than $4 \times 4 \times 4 = 64$. On the other hand, we have
$\|f - A_M\|_\infty \to 0$ as $M \to \infty$. □

Figure 1: Finite automaton for the proof of Lemma 4.1.



We are now prepared to prove our desired upper bounds for halfspaces.

Theorem 4.2. Let $f\colon \{-1,+1\}^{nk} \to \{-1,+1\}$ be the function given by

\[ f(x) = \operatorname{sgn}\Bigl( 1 + \sum_{i=1}^{n} \sum_{j=1}^{k} 2^i x_{ij} \Bigr). \tag{4.1} \]

Then

\[ R^+(f, 64k\lceil \log k \rceil + 1) = 0. \tag{4.2} \]

In addition, for all integers $d \ge 1$,

\[ R^+(f, d) \le 1 - (k2^{n+1})^{-1/d}. \tag{4.3} \]

In particular, Theorem 4.2 settles all upper bounds on $\operatorname{rdeg}_\epsilon(f)$ in Theorem 1.6.



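Proof of Theorem 4.2. Theorem 2.4 immediately implies (4.3) in view of the
representation (4.1). It remains to prove (4.2). In the degenerate case $k = 1$, we have
$f \equiv x_{n1}$ and thus (4.2) holds. In what follows, we assume that $k \ge 2$ and put
$\Delta = \lceil \log k \rceil$. We adopt the convention that $x_{ij} \equiv 0$ for $i > n$. For $\ell = 0, 1, 2, \dots$,
define

\[ S_\ell = \sum_{i=1}^{\Delta} \sum_{j=1}^{k} 2^{i-1} x_{\ell\Delta+i,\,j}. \]

Then

\[ \sum_{i=1}^{n} \sum_{j=1}^{k} 2^i x_{ij}
   = \bigl( 2^{1} S_0 + 2^{2\Delta+1} S_2 + 2^{4\Delta+1} S_4 + 2^{6\Delta+1} S_6 + \cdots \bigr)
   + \bigl( 2^{\Delta+1} S_1 + 2^{3\Delta+1} S_3 + 2^{5\Delta+1} S_5 + 2^{7\Delta+1} S_7 + \cdots \bigr). \tag{4.4} \]

Now, each $S_\ell$ is an integer in $[-2^{2\Delta}+1, 2^{2\Delta}-1]$ and therefore admits a
representation as

\[ S_\ell = z_{\ell,1} + 2z_{\ell,2} + 2^2 z_{\ell,3} + \cdots + 2^{2\Delta-1} z_{\ell,2\Delta}, \]

where $z_{\ell,1}, \dots, z_{\ell,2\Delta} \in \{-1,0,+1\}$. Furthermore, each $S_\ell$ only depends on $k\Delta$ of
the original variables $x_{ij}$, whence $z_{\ell,1}, \dots, z_{\ell,2\Delta}$ can all be viewed as polynomials
of degree at most $k\Delta$ in the original variables. Rewriting (4.4),

\[ \sum_{i=1}^{n} \sum_{j=1}^{k} 2^i x_{ij} = \sum_{i \ge 1} 2^i z_{\ell(i),j(i)} + \sum_{i \ge \Delta+1} 2^i z_{\ell'(i),j'(i)} \]

for appropriate indexing functions $\ell(i), \ell'(i), j(i), j'(i)$. Thus,

\[ f(x) = \operatorname{sgn}\Bigl( 1 + \sum_{i=1}^{\Delta} 2^i \underbrace{z_{\ell(i),j(i)}}
   + \sum_{i \ge \Delta+1} 2^i \underbrace{\bigl( z_{\ell(i),j(i)} + z_{\ell'(i),j'(i)} \bigr)} \Bigr). \]

Since the underbraced expressions range in $\{0,\pm1,\pm2\}$ and are polynomials of
degree at most $k\Delta$ in the original variables, Lemma 4.1 implies (4.2). □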
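The block decomposition behind (4.4) can be sanity-checked numerically. The sketch below rests on an assumption about the damaged definition: we read $S_\ell$ as the weighted sum of $\Delta$ consecutive rows, $S_\ell = \sum_{i=1}^{\Delta}\sum_{j=1}^{k} 2^{i-1} x_{\ell\Delta+i,j}$, with $x_{ij} = 0$ for $i > n$. The helper names are ours.

```python
import random

def weighted_sum(x):
    """sum_i sum_j 2^i x_ij for a sign matrix x with rows i = 1..n."""
    return sum(2 ** i * sum(row) for i, row in enumerate(x, start=1))

def block_sums(x, n, k):
    """Return Delta = ceil(log2 k) and the block sums S_0, S_1, ... (assumed form)."""
    delta = (k - 1).bit_length()          # integer ceil(log2 k), valid for k >= 2
    num_blocks = -(-n // delta)           # ceil(n / delta)
    S = []
    for l in range(num_blocks):
        total = 0
        for i in range(1, delta + 1):
            row = l * delta + i
            if row <= n:                  # convention: x_ij = 0 for i > n
                total += 2 ** (i - 1) * sum(x[row - 1])
        S.append(total)
    return delta, S

random.seed(0)
n, k = 7, 5
x = [[random.choice((-1, 1)) for _ in range(k)] for _ in range(n)]
delta, S = block_sums(x, n, k)
recombined = sum(2 ** (l * delta + 1) * s for l, s in enumerate(S))
```

Under this reading, `recombined` equals `weighted_sum(x)`, and every block sum lies in $[-2^{2\Delta}+1, 2^{2\Delta}-1]$, which is what makes the signed binary digits $z_{\ell,1}, \dots, z_{\ell,2\Delta}$ available.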



4.2 Preparatory work 

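This section sets the stage for our rational approximation lower bounds with some
preparatory results about halfspaces. It will be convenient to establish some
additional notation, for use in this section only. Here, we typeset real vectors in
boldface ($\mathbf{x}_1, \mathbf{x}_2, \mathbf{z}, \mathbf{v}$) to better distinguish them from scalars. The $i$th component
of a vector $\mathbf{x} \in \mathbb{R}^n$ is denoted by $(\mathbf{x})_i$, while the symbol $\mathbf{x}_i$ is reserved for
another vector from some enumeration. In keeping with this convention, we let $\mathbf{e}_i$
denote the vector with $1$ in the $i$th component and zeroes everywhere else. For
$\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, the vector $\mathbf{x}\mathbf{y} \in \mathbb{R}^n$ is given by $(\mathbf{x}\mathbf{y})_i = (\mathbf{x})_i (\mathbf{y})_i$. More generally, for a
polynomial $p$ on $\mathbb{R}^k$ and vectors $\mathbf{x}_1, \dots, \mathbf{x}_k \in \mathbb{R}^n$, we define $p(\mathbf{x}_1, \dots, \mathbf{x}_k) \in \mathbb{R}^n$
by $(p(\mathbf{x}_1, \dots, \mathbf{x}_k))_i = p((\mathbf{x}_1)_i, \dots, (\mathbf{x}_k)_i)$. The expectation of a random
variable $\mathbf{x} \in \mathbb{R}^n$ is defined componentwise, i.e., the vector $\mathbf{E}[\mathbf{x}] \in \mathbb{R}^n$ is given by
$(\mathbf{E}[\mathbf{x}])_i = \mathbf{E}[(\mathbf{x})_i]$.

For convenience, we adopt the notational shorthand $a^0 = 1$ for all $a \in \mathbb{R}$. In
particular, if $\mathbf{x} \in \mathbb{R}^n$ is a given vector, then $\mathbf{x}^0 = (1, 1, \dots, 1) \in \mathbb{R}^n$. A scalar
$a \in \mathbb{R}$, when interpreted as a vector, stands for $(a, a, \dots, a)$. This shorthand
allows one to speak of $\operatorname{span}\{1, \mathbf{z}, \mathbf{z}^2, \dots, \mathbf{z}^k\}$, for example, where $\mathbf{z} \in \mathbb{R}^n$ is a given
vector.

Theorem 4.3. Let $N$ and $m$ be positive integers. Then reals $\alpha_0, \alpha_1, \dots, \alpha_{4m}$ exist
with the following property: for each $\mathbf{b} \in \{0,1\}^N$, there is a probability distribution
$\mu_{\mathbf{b}}$ on $\{0,\pm1,\dots,\pm m\}^N$ such that

\[ \mathbf{E}_{\mathbf{v} \sim \mu_{\mathbf{b}}} \bigl[ (2\mathbf{v} + \mathbf{b})^d \bigr] = (\alpha_d, \alpha_d, \dots, \alpha_d), \qquad d = 0, 1, 2, \dots, 4m. \]

Proof. Let $\lambda_0$ and $\lambda_1$ be the distributions on $\{0,\pm1,\dots,\pm m\}$ given by

\[ \lambda_0(t) = 16^{-m} \binom{4m+1}{2m+2t}, \qquad \lambda_1(t) = 16^{-m} \binom{4m+1}{2m+2t+1}. \]

Then for $d = 0, 1, \dots, 4m$, one has

\[ \mathbf{E}_{t \sim \lambda_0} \bigl[ (2t)^d \bigr] - \mathbf{E}_{t \sim \lambda_1} \bigl[ (2t+1)^d \bigr]
   = 16^{-m} \sum_{t=0}^{4m+1} (-1)^t \binom{4m+1}{t} (t - 2m)^d = 0, \tag{4.5} \]

where (4.5) holds by Fact 2.1. Now, let $\mu_{\mathbf{b}} = \lambda_{(\mathbf{b})_1} \times \lambda_{(\mathbf{b})_2} \times \cdots \times \lambda_{(\mathbf{b})_N}$. Then in view
of (4.5), the theorem holds by letting $\alpha_d = \mathbf{E}_{t \sim \lambda_0}[(2t)^d]$ for $d = 0, 1, 2, \dots, 4m$. □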
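The moment-matching identity (4.5) can be confirmed in exact arithmetic for small $m$. This sketch assumes the binomial form of the two distributions, $\lambda_0(t) = 16^{-m}\binom{4m+1}{2m+2t}$ and $\lambda_1(t) = 16^{-m}\binom{4m+1}{2m+2t+1}$ on $\{-m,\dots,m\}$; the function names are ours.

```python
from fractions import Fraction
from math import comb

def lambdas(m):
    """The two distributions on {-m, ..., m} (assumed binomial form)."""
    scale = Fraction(1, 16 ** m)
    lam0 = {t: scale * comb(4 * m + 1, 2 * m + 2 * t) for t in range(-m, m + 1)}
    lam1 = {t: scale * comb(4 * m + 1, 2 * m + 2 * t + 1) for t in range(-m, m + 1)}
    return lam0, lam1

def moment(lam, b, d):
    # E[(2t + b)^d] for t drawn from lam; Python's 0**0 == 1 matches the
    # convention a^0 = 1 adopted in this section.
    return sum(prob * (2 * t + b) ** d for t, prob in lam.items())

m = 3
lam0, lam1 = lambdas(m)
matching = [moment(lam0, 0, d) == moment(lam1, 1, d) for d in range(4 * m + 1)]
```

With these definitions both dictionaries sum to $1$, and the even-shifted and odd-shifted variables agree on every moment up to order $4m$. The agreement stops at order $4m+1$, which is why the theorem is stated for $d \le 4m$.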



Using the previous theorem, we will now establish another auxiliary result 
pertaining to halfspaces. 

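Theorem 4.4. Put $\mathbf{z} = (-2^n, -2^{n-1}, \dots, -2^0, 2^0, \dots, 2^{n-1}, 2^n) \in \mathbb{R}^{2n+2}$. There
are random variables $\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_{n+1} \in \{0,\pm1,\pm2,\dots,\pm(3n+1)\}^{2n+2}$ such
that:

\[ \sum_{i=1}^{n+1} 2^{i-1} \mathbf{x}_i \equiv \mathbf{z} \tag{4.6} \]

and

\[ \mathbf{E}\Bigl[ \prod_{i=1}^{n} \mathbf{x}_i^{d_i} \Bigr] \in \operatorname{span}\{(1, 1, \dots, 1)\} \tag{4.7} \]

for $d_1, \dots, d_n \in \{0, 1, \dots, 4n\}$.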
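Proof. Let

\[ \mathbf{x}_i = 2\mathbf{y}_i - \mathbf{y}_{i-1} + \mathbf{e}_{n+1+i} - \mathbf{e}_{n+2-i}, \qquad i = 1, 2, \dots, n+1, \]

where $\mathbf{y}_0, \mathbf{y}_1, \dots, \mathbf{y}_{n+1}$ are suitable random variables with $\mathbf{y}_0 \equiv \mathbf{y}_{n+1} \equiv 0$. Then
property (4.6) is immediate. We will construct $\mathbf{y}_0, \mathbf{y}_1, \dots, \mathbf{y}_{n+1}$ such that the
remaining property (4.7) holds as well.

Let $N = 2n+2$ and $m = n$ in Theorem 4.3. Then reals $\alpha_0, \alpha_1, \dots, \alpha_{4n}$ exist
with the property that for each $\mathbf{b} \in \{0,1\}^{2n+2}$, a probability distribution $\mu_{\mathbf{b}}$ can be
found on $\{0,\pm1,\dots,\pm n\}^{2n+2}$ such that

\[ \mathbf{E}_{\mathbf{v} \sim \mu_{\mathbf{b}}} \bigl[ (2\mathbf{v} + \mathbf{b})^d \bigr] = \alpha_d (1, 1, \dots, 1), \qquad d = 0, 1, \dots, 4n. \tag{4.8} \]

Now, we will specify the distribution of $\mathbf{y}_0, \mathbf{y}_1, \dots, \mathbf{y}_n$ by giving an algorithm
for generating $\mathbf{y}_i$ from $\mathbf{y}_{i-1}$. First, recall that $\mathbf{y}_0 \equiv \mathbf{y}_{n+1} \equiv 0$. The algorithm for
generating $\mathbf{y}_i$ given $\mathbf{y}_{i-1}$ ($i = 1, 2, \dots, n$) is as follows.

(1) Let $\mathbf{u}$ be the unique integer vector such that $2\mathbf{u} - \mathbf{y}_{i-1} + \mathbf{e}_{n+1+i} - \mathbf{e}_{n+2-i} \in \{0,1\}^{2n+2}$.

(2) Draw a random vector $\mathbf{v} \sim \mu_{\mathbf{b}}$, where $\mathbf{b} = 2\mathbf{u} - \mathbf{y}_{i-1} + \mathbf{e}_{n+1+i} - \mathbf{e}_{n+2-i}$.

(3) Set $\mathbf{y}_i = \mathbf{v} + \mathbf{u}$.

One easily verifies that $\mathbf{y}_0, \mathbf{y}_1, \dots, \mathbf{y}_{n+1} \in \{0,\pm1,\dots,\pm 3n\}^{2n+2}$.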
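The three-step sampling procedure above can be sketched directly. The code below is an illustration under stated assumptions: the coordinate distributions are taken to be $\lambda_0, \lambda_1$ with weights $\binom{4m+1}{2m+2t}$ and $\binom{4m+1}{2m+2t+1}$ for $m = n$ (our reading of the proof of Theorem 4.3), and all function names are ours. It generates $\mathbf{x}_1, \dots, \mathbf{x}_{n+1}$ so that the identity $\sum_i 2^{i-1}\mathbf{x}_i = \mathbf{z}$ can be checked on a sample.

```python
import random
from math import comb

def sample_lambda(bit, m, rng):
    """Draw t in {-m, ..., m} with weight C(4m+1, 2m+2t+bit)."""
    support = list(range(-m, m + 1))
    weights = [comb(4 * m + 1, 2 * m + 2 * t + bit) for t in support]
    return rng.choices(support, weights=weights)[0]

def unit(index, size):
    # e_index, using the 1-based indexing of the text
    return [1 if pos == index else 0 for pos in range(1, size + 1)]

def generate_x(n, rng):
    """Return the vectors x_1, ..., x_{n+1} as a list of integer lists."""
    size = 2 * n + 2
    y_prev = [0] * size                                   # y_0 = 0
    xs = []
    for i in range(1, n + 2):
        shift = [a - b for a, b in zip(unit(n + 1 + i, size), unit(n + 2 - i, size))]
        if i <= n:
            c = [s - yp for s, yp in zip(shift, y_prev)]  # -y_{i-1} + e - e
            b = [cj % 2 for cj in c]                      # step (1): b = 2u + c
            u = [(bj - cj) // 2 for bj, cj in zip(b, c)]
            v = [sample_lambda(bj, n, rng) for bj in b]   # step (2): v ~ mu_b
            y = [vj + uj for vj, uj in zip(v, u)]         # step (3): y_i = v + u
        else:
            y = [0] * size                                # y_{n+1} = 0
        xs.append([2 * yj - yp + s for yj, yp, s in zip(y, y_prev, shift)])
        y_prev = y
    return xs
```

For $i \le n$ each generated $\mathbf{x}_i$ equals $2\mathbf{v} + \mathbf{b}$ by construction, so its entries lie between $-2n$ and $2n+1$. The weighted sum telescopes to $\mathbf{z}$ regardless of the random draws.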

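Let $R$ denote the resulting joint distribution of $(\mathbf{y}_0, \mathbf{y}_1, \dots, \mathbf{y}_{n+1})$. Let $i \le n$.
Then conditioned on any fixed value of $(\mathbf{y}_0, \mathbf{y}_1, \dots, \mathbf{y}_{i-1})$ in the support of $R$, the
random variable $\mathbf{x}_i$ is by definition independent of $\mathbf{x}_1, \dots, \mathbf{x}_{i-1}$ and is distributed
identically to $2\mathbf{v} + \mathbf{b}$, for some fixed vector $\mathbf{b} \in \{0,1\}^{2n+2}$ and a random variable
$\mathbf{v} \sim \mu_{\mathbf{b}}$. In view of (4.8), we conclude that

\[ \mathbf{E}\Bigl[ \prod_{i=1}^{n} \mathbf{x}_i^{d_i} \Bigr] = (1, 1, \dots, 1) \prod_{i=1}^{n} \alpha_{d_i} \]

for all $d_1, d_2, \dots, d_n \in \{0, 1, \dots, 4n\}$, which establishes (4.7). It remains to note
that $\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n \in \{-2n, -2n+1, \dots, -1, 0, 1, \dots, 2n, 2n+1\}^{2n+2}$, whereas
$\mathbf{x}_{n+1} = -\mathbf{y}_n + \mathbf{e}_{2n+2} - \mathbf{e}_1 \in \{0,\pm1,\dots,\pm(3n+1)\}^{2n+2}$. □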



At last, we arrive at the main theorem of this section, which will play a crucial 
role in our analysis of the rational approximation of halfspaces. 

Theorem 4.5. For $i = 0, 1, 2, \dots, n$, define

\[ A_i = \Bigl\{ (x_1, \dots, x_{n+1}) \in \{0,\pm1,\dots,\pm(3n+1)\}^{n+1} : \sum_{j=1}^{n+1} 2^{j-1} x_j = 2^i \Bigr\}. \]

Let $p(x_1, \dots, x_{n+1})$ be a real polynomial with sign $(-1)^i$ throughout $A_i$
($i = 0, 1, 2, \dots, n$) and sign $(-1)^{i+1}$ throughout $-A_i$ ($i = 0, 1, 2, \dots, n$). Then

\[ \deg p \ge 2n + 1. \]

Proof. For the sake of contradiction, suppose that $p$ has degree no greater than $2n$.
Put $\mathbf{z} = (-2^n, -2^{n-1}, \dots, -2^0, 2^0, \dots, 2^{n-1}, 2^n)$. Let $\mathbf{x}_1, \dots, \mathbf{x}_{n+1}$ be the random
variables constructed in Theorem 4.4. By (4.7) and the identity $\mathbf{x}_{n+1} \equiv 2^{-n}\mathbf{z} -
\sum_{i=1}^{n} 2^{i-n-1} \mathbf{x}_i$, we have

\[ \mathbf{E}[p(\mathbf{x}_1, \dots, \mathbf{x}_{n+1})] \in \operatorname{span}\{1, \mathbf{z}, \mathbf{z}^2, \dots, \mathbf{z}^{2n}\}, \]

whence $\mathbf{E}[p(\mathbf{x}_1, \dots, \mathbf{x}_{n+1})] = q(\mathbf{z})$ for a univariate polynomial $q \in P_{2n}$. In view
of (4.6) and the assumed sign behavior of $p$, we have $\operatorname{sgn} q(2^i) = (-1)^i$ and
$\operatorname{sgn} q(-2^i) = (-1)^{i+1}$, for $i = 0, 1, 2, \dots, n$. Therefore, $q$ has at least $2n + 1$
roots. Since $q \in P_{2n}$, we arrive at a contradiction. It follows that the assumed
polynomial $p$ does not exist. □



Remark 4.6. The passage $p \mapsto q$ in the proof of Theorem 4.5 is precisely
the linear degree-nonincreasing map $M\colon \mathbb{R}[x_1, x_2, \dots, x_{n+1}] \to \mathbb{R}[x]$ described
previously in the Introduction.

4.3 Lower bounds 

The purpose of this section is to prove that the canonical halfspace cannot be 
approximated well by a rational function of low degree. A starting point in our 
discussion is a criterion for inapproximability by low-degree rational functions, 
which is applicable not only to halfspaces but any odd Boolean functions on Eu- 
clidean space. 

Theorem 4.7 (Criterion for inapproximability). Fix a nonempty finite subset
$S \subset \mathbb{R}^m$ with $S \cap -S = \varnothing$. Define $f\colon S \cup -S \to \{-1,+1\}$ by

\[ f(x) = \begin{cases} +1, & x \in S, \\ -1, & x \in -S. \end{cases} \]

Let $\psi\colon S \cup -S \to \mathbb{R}$ be a real function such that

\[ \psi(x) > \delta |\psi(-x)|, \qquad x \in S, \tag{4.9} \]

for some $\delta \in (0, 1)$ and

\[ \sum_{S \cup -S} \psi(x) u(x) = 0 \tag{4.10} \]

for every polynomial $u$ of degree at most $d$. Then

\[ R^+(f, d) \ge \frac{2\delta}{1 + \delta}. \]

Proof. Fix polynomials $p, q$ of degree at most $d$ such that $q$ is positive on $S \cup -S$.
Put

\[ \epsilon = \max_{S \cup -S} \left| f(x) - \frac{p(x)}{q(x)} \right|. \]

We assume that $\epsilon < 1$ since otherwise there is nothing to show. For $x \in S$,

\[ (1 - \epsilon) q(x) \le p(x) \le (1 + \epsilon) q(x) \tag{4.11} \]

and

\[ (1 - \epsilon) q(-x) \le -p(-x) \le (1 + \epsilon) q(-x). \tag{4.12} \]

Consider the polynomial $u(x) = q(x) + q(-x) + p(x) - p(-x)$. Equations (4.11)
and (4.12) show that for $x \in S$, one has $u(x) \ge (2 - \epsilon)\{q(x) + q(-x)\}$ and
$|u(-x)| \le \epsilon\{q(x) + q(-x)\}$, whence

\[ u(x) \ge \Bigl( \frac{2}{\epsilon} - 1 \Bigr) |u(-x)|, \qquad x \in S. \tag{4.13} \]

We also note that

\[ u(x) > 0, \qquad x \in S. \tag{4.14} \]

Since $u$ has degree at most $d$, we have by (4.10) that

\[ \sum_{x \in S} \{\psi(x) u(x) + \psi(-x) u(-x)\} = \sum_{S \cup -S} \psi(x) u(x) = 0, \]

whence

\[ \psi(x) u(x) \le |\psi(-x) u(-x)| \]

for some $x \in S$. At the same time, it follows from (4.9), (4.13), and (4.14) that

\[ \psi(x) u(x) > \delta \Bigl( \frac{2}{\epsilon} - 1 \Bigr) |\psi(-x) u(-x)|, \qquad x \in S. \]

We immediately obtain $\delta(\{2/\epsilon\} - 1) < 1$, that is, $\epsilon > 2\delta/(1 + \delta)$, as was to be shown. □

Remark 4.8. The method of Theorem 4.7 amounts to reformulating (4.13) and
(4.14) as a linear program and exhibiting a solution to its dual. The presentation
above does not explicitly use the language of linear programs or appeal to duality,
however, because our goal is solely to prove the correctness of our method and not
its completeness.

Using the criterion of Theorem 4.7 and our preparatory work in Section 4.2,
we now establish a key lower bound for the rational approximation of halfspaces
within constant error.



Theorem 4.9. Let $f\colon \{0,\pm1,\dots,\pm(3n+1)\}^{n+1} \to \{-1,+1\}$ be given by

\[ f(x) = \operatorname{sgn}\Bigl( 1 + \sum_{i=1}^{n+1} 2^i x_i \Bigr). \]

Then

\[ R^+(f, n) = \Omega(1). \]



Proof. Let $A_0, A_1, \dots, A_n$ be as defined in Theorem 4.5. Put $A = \bigcup A_i$ and define
$g\colon A \cup -A \to \{-1,+1\}$ by

\[ g(x) = \begin{cases} (-1)^i, & x \in A_i, \\ (-1)^{i+1}, & x \in -A_i. \end{cases} \]

Then $\deg_\pm(g) > 2n$ by Theorem 4.5. As a result, Theorem 2.2 guarantees the
existence of a function $\phi\colon A \cup -A \to \mathbb{R}$, not identically zero, such that

\[ \phi(x) g(x) \ge 0, \qquad x \in A \cup -A, \tag{4.15} \]

and

\[ \sum_{A \cup -A} \phi(x) u(x) = 0 \tag{4.16} \]

for every polynomial $u$ of degree at most $2n$. Put

\[ p(x) = \prod_{j=0}^{n-1} \Bigl( -2^j \sqrt{2} + \sum_{i=1}^{n+1} 2^{i-1} x_i \Bigr) \]

and

\[ \psi(x) = (-1)^n \{\phi(x) - \phi(-x)\}\, p(x). \]

Define $S = A \setminus \psi^{-1}(0)$. Then $S \ne \varnothing$ by (4.15) and the fact that $\phi$ is not identically
zero on $A \cup -A$. For $x \in S$, we have $\psi(-x) \ne 0$ and

\[ \frac{|\psi(x)|}{|\psi(-x)|} = \frac{|p(x)|}{|p(-x)|} > \exp(-9\sqrt{2}), \]

where the final step uses the bound $(a - 1)/(a + 1) > \exp(-2.5/a)$, valid for
$a \ge \sqrt{2}$. It follows from (4.15) and the definition of $p$ that $\psi$ is positive on $S$.
Hence,

\[ \psi(x) > \exp(-9\sqrt{2})\, |\psi(-x)|, \qquad x \in S. \tag{4.17} \]

For any polynomial $u$ of degree no greater than $n$, we infer from (4.16) that

\[ \sum_{S \cup -S} \psi(x) u(x) = (-1)^n \sum_{A \cup -A} \{\phi(x) - \phi(-x)\}\, u(x) p(x) = 0. \tag{4.18} \]

Since $f$ is positive on $S$ and negative on $-S$, the proof is now complete in view of
(4.17), (4.18), and Theorem 4.7. □
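The constant $\exp(-9\sqrt{2})$ in the proof can be checked numerically. The sketch assumes $p(x) = \prod_{j=0}^{n-1}(-2^j\sqrt{2} + \sum_i 2^{i-1}x_i)$, so that on a point of $A_i$ the ratio $|p(x)|/|p(-x)|$ depends only on $i$ and equals $\prod_j |2^i - 2^j\sqrt2| / (2^i + 2^j\sqrt2)$. The function names are ours.

```python
from math import exp, sqrt

def ratio(i, n):
    """|p(x)| / |p(-x)| for any x with sum_j 2^(j-1) x_j = 2^i (assumed form of p)."""
    value = 1.0
    for j in range(n):
        root = 2 ** j * sqrt(2)
        value *= abs(2 ** i - root) / (2 ** i + root)
    return value

def scalar_bound_holds(a):
    # The elementary inequality used in the final step of the estimate.
    return (a - 1) / (a + 1) > exp(-2.5 / a)

worst = min(ratio(i, n) for n in range(1, 31) for i in range(n + 1))
```

The inequality $(a-1)/(a+1) > \exp(-2.5/a)$ is tightest at $a = \sqrt 2$. On the range tested, the product itself stays well above $\exp(-9\sqrt 2)$, which suggests the constant in the proof is chosen for convenience rather than sharpness.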



We have reached the main result of this section, which extends Theorem 4.9 to
any subconstant approximation error and to halfspaces on the hypercube.

Theorem 4.10. Let $F\colon \{-1,+1\}^{m^2} \to \{-1,+1\}$ be given by

\[ F(x) = \operatorname{sgn}\Bigl( 1 + \sum_{i=1}^{m} \sum_{j=1}^{m} 2^i x_{ij} \Bigr). \]

Then for $d < m/14$,

\[ R^+(F, d) \ge 1 - 2^{-\Theta(m/d)}. \tag{4.19} \]

Observe that Theorem 4.10 settles the lower bounds in Theorem 1.6 from the
Introduction.



Proof of Theorem 4.10. We may assume that $m \ge 14$, the claim being trivial
otherwise. Consider the function $G\colon \{-1,+1\}^{(n+1)(6n+2)} \to \{-1,+1\}$ given by

\[ G(x) = \operatorname{sgn}\Bigl( 1 + \sum_{i=1}^{n+1} \sum_{j=1}^{6n+2} 2^i x_{ij} \Bigr), \]

where $n = \lfloor (m - 2)/6 \rfloor$. For every $\epsilon > R^+(G, n)$, Proposition 2.7 provides a
rational function $A$ on $\mathbb{R}^{n+1}$ of degree at most $n$ such that, on the domain of $G$,

\[ \Bigl| G(x) - A\Bigl( \sum_{j=1}^{6n+2} x_{1j}, \dots, \sum_{j=1}^{6n+2} x_{n+1,j} \Bigr) \Bigr| < \epsilon \]

and the denominator of $A$ is positive. Letting $f$ be the function in Theorem 4.9,
it follows that $|f(x_1, \dots, x_{n+1}) - A(2x_1, \dots, 2x_{n+1})| < \epsilon$ on the domain of $f$,
whence

\[ R^+(G, n) = \Omega(1). \tag{4.20} \]

We now claim that either $G(x)$ or $-G(-x)$ is a subfunction of $F$. For example,
consider the following substitution for the variables $x_{ij}$ for which $i > n + 1$ or
$j > 6n + 2$:

\[ \begin{aligned}
x_{mj} &\leftarrow (-1)^{j} && (1 \le j \le m), \\
x_{ij} &\leftarrow (-1)^{j+1} && (n + 1 < i < m,\ 1 \le j \le m), \\
x_{ij} &\leftarrow (-1)^{j+1} && (1 \le i \le n + 1,\ j > 6n + 2).
\end{aligned} \]

After this substitution, $F$ is a function of the remaining variables $x_{ij}$ and is
equivalent to $G(x)$ if $m$ is even, and to $-G(-x)$ if $m$ is odd. In either case, (4.20) implies
that

\[ R^+(F, n) = \Omega(1). \tag{4.21} \]

Theorem 2.5 shows that

\[ R^+(F, n/2) \le 1 - \Bigl( \frac{1 - R^+(F, d)}{2} \Bigr)^{1/\lfloor n/(2d) \rfloor} \]

for $d = 1, 2, \dots, \lfloor n/2 \rfloor$, which yields (4.19) in light of (2.2) and (4.21).
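The subfunction claim can be tested by random sampling. The sketch below assumes the substitution reads $x_{mj} \leftarrow (-1)^j$ for the last row and $x_{ij} \leftarrow (-1)^{j+1}$ for every other variable outside the $(n+1) \times (6n+2)$ block; the third exponent is our reconstruction, chosen so that the constants cancel. The names `sgn_halfspace` and `embed` are ours.

```python
import random

def sgn_halfspace(rows):
    # sgn(1 + sum_i 2^i * sum_j x_ij), rows indexed from i = 1;
    # the argument is always odd, hence nonzero.
    total = 1 + sum(2 ** i * sum(row) for i, row in enumerate(rows, start=1))
    return 1 if total > 0 else -1

def embed(xg, m, n):
    """Extend an (n+1) x (6n+2) sign matrix to an m x m input of F."""
    full = []
    for i in range(1, m + 1):
        row = []
        for j in range(1, m + 1):
            if i <= n + 1 and j <= 6 * n + 2:
                row.append(xg[i - 1][j - 1])       # free variable of G
            elif i == m:
                row.append((-1) ** j)              # last row
            else:
                row.append((-1) ** (j + 1))        # all other fixed variables
        full.append(row)
    return full

def restricted_matches(m, trials, rng):
    """Check F(embed(x)) against G(x) (m even) or -G(-x) (m odd) on random inputs."""
    n = (m - 2) // 6
    for _ in range(trials):
        xg = [[rng.choice((-1, 1)) for _ in range(6 * n + 2)] for _ in range(n + 1)]
        value = sgn_halfspace(embed(xg, m, n))
        if m % 2 == 0:
            expected = sgn_halfspace(xg)
        else:
            expected = -sgn_halfspace([[-b for b in row] for row in xg])
        if value != expected:
            return False
    return True
```

For even $m$ the fixed variables sum to zero in every row. For odd $m$ they contribute $-2^m + (2^m - 2^{n+2}) + (2^{n+2} - 2) = -2$ in total, turning the constant term $1$ into $-1$, which is exactly $-G(-x)$.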



5 Rational approximation of the majority function 

The goal of this section is to determine $R^+(\mathrm{MAJ}_n, d)$ for each integer $d$, i.e., to
determine the least error to which a degree-$d$ multivariate rational function can
approximate the majority function. As is frequently the case with symmetric Boolean
functions such as majority, the multivariate problem of analyzing $R^+(\mathrm{MAJ}_n, d)$ is
equivalent to a univariate question. Specifically, given an integer $d$ and a finite set
$S \subset \mathbb{R}$, we define

\[ R^+(d, S) = \inf_{p,q} \max_{t \in S} \left| \operatorname{sgn} t - \frac{p(t)}{q(t)} \right|, \]

where the infimum ranges over $p, q \in P_d$ such that $q$ is positive on $S$. In other
words, we study how well a rational function of a given degree can approximate
the sign function over a finite support. We give a detailed answer to this question
in the following theorem:

Theorem 5.1 (Rational approximation of majority). Let $n, d$ be positive
integers. Abbreviate $R = R^+(d, \{\pm1, \pm2, \dots, \pm n\})$. For $1 \le d \le \log n$,

\[ \exp\Bigl\{ -\Theta\Bigl( \frac{1}{n^{1/(2d)}} \Bigr) \Bigr\} \le R < \exp\Bigl\{ -\frac{1}{n^{1/d}} \Bigr\}. \]

For $\log n < d < n$,

\[ R = \exp\Bigl\{ -\Theta\Bigl( \frac{d}{\log(2n/d)} \Bigr) \Bigr\}. \]

For $d \ge n$,

\[ R = 0. \]

Moreover, the rational approximant is constructed explicitly in each case.

Theorem 5.1 is the main result of this section. We establish it in the next
two subsections, giving separate treatment to the cases $d \le \log n$ and $d > \log n$
(see Theorems 5.3 and 5.8, respectively). In the concluding subsection, we give
the promised proof that $R^+(d, \{\pm1, \dots, \pm n\})$ and $R^+(\mathrm{MAJ}_n, d)$ are essentially
equivalent.

5.1 Low-degree approximation 



We start by specializing the criterion of Theorem 4.7 to the problem of
approximating the sign function on the set $\{\pm1, \pm2, \dots, \pm n\}$.

Theorem 5.2. Let $d$ be an integer, $0 \le d \le 2n - 1$. Fix a nonempty subset
$S \subseteq \{1, 2, \dots, n\}$. Suppose that there exists a real $\delta \in (0, 1)$ and a polynomial
$r \in P_{2n-d-1}$ that vanishes on $\{-n, \dots, n\} \setminus (S \cup -S)$ and obeys

\[ (-1)^t r(t) > \delta |r(-t)|, \qquad t \in S. \tag{5.1} \]

Then

\[ R^+(d, S \cup -S) \ge \frac{2\delta}{1 + \delta}. \tag{5.2} \]

Proof. Define $f\colon S \cup -S \to \{-1,+1\}$ by $f(t) = \operatorname{sgn} t$. Define $\psi\colon S \cup -S \to \mathbb{R}$
by $\psi(t) = (-1)^t \binom{2n}{n+t} r(t)$. Then (5.1) takes on the form

\[ \psi(t) > \delta |\psi(-t)|, \qquad t \in S. \tag{5.3} \]

For every polynomial $u$ of degree at most $d$, we have

\[ \sum_{S \cup -S} \psi(t) u(t) = \sum_{t=-n}^{n} (-1)^t \binom{2n}{n+t} r(t) u(t) = 0 \tag{5.4} \]

by Fact 2.1. Now (5.2) is immediate from (5.3), (5.4), and Theorem 4.7. □
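The orthogonality step rests on the standard fact that an alternating binomial sum annihilates every polynomial of degree below $2n$. Here the product $r(t)u(t)$ has degree at most $(2n-d-1) + d = 2n-1$. The check below is a small illustration on monomials, under the assumption that $\psi$ carries the weights $(-1)^t\binom{2n}{n+t}$; the function name is ours.

```python
from math import comb

def alternating_binomial_sum(n, degree):
    """sum_{t=-n}^{n} (-1)^t C(2n, n+t) t^degree, computed exactly."""
    total = 0
    for t in range(-n, n + 1):
        sign = 1 if t % 2 == 0 else -1      # (-1)^t, also correct for negative t
        total += sign * comb(2 * n, n + t) * t ** degree
    return total

n = 5
vanishing = [alternating_binomial_sum(n, k) for k in range(2 * n)]
first_nonzero = alternating_binomial_sum(n, 2 * n)
```

Every entry of `vanishing` is zero, while the sum for degree $2n$ is not. This is why $r$ is restricted to degree $2n - d - 1$.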



Using Theorem 5.2, we will now determine the optimal error in the approximation
of the majority function by rational functions of degree up to $\log n$. The case
of higher degrees will be settled in the next subsection.

Theorem 5.3 (Low-degree rational approximation of majority). Let $d$ be an
integer, $1 \le d \le \log n$. Then

\[ \exp\Bigl\{ -\Theta\Bigl( \frac{1}{n^{1/(2d)}} \Bigr) \Bigr\} \le R^+(d, \{\pm1, \pm2, \dots, \pm n\}) < \exp\Bigl\{ -\frac{1}{n^{1/d}} \Bigr\}. \]

Proof. The upper bound is immediate from Newman's Theorem 2.4. For the lower
bound, put $\Delta = \lfloor n^{1/d} \rfloor \ge 2$ and $S = \{1, \Delta, \Delta^2, \dots, \Delta^d\}$. Define $r \in P_{2n-d-1}$ by

\[ r(t) = (-1)^n \prod_{i=0}^{d-1} (t - \Delta^i \sqrt{\Delta}) \prod_{i \in \{-n, \dots, n\} \setminus (S \cup -S)} (t - i). \]

For $j = 0, 1, 2, \dots, d$,

\[ \frac{|r(\Delta^j)|}{|r(-\Delta^j)|}
   = \prod_{i=0}^{j-1} \frac{\Delta^j - \Delta^i\sqrt{\Delta}}{\Delta^j + \Delta^i\sqrt{\Delta}}
     \prod_{i=j}^{d-1} \frac{\Delta^i\sqrt{\Delta} - \Delta^j}{\Delta^i\sqrt{\Delta} + \Delta^j}
   > \Bigl( \prod_{i=1}^{\infty} \frac{\Delta^{i/2} - 1}{\Delta^{i/2} + 1} \Bigr)^{2}
   > \exp\Bigl\{ -5 \sum_{i=1}^{\infty} \frac{1}{\Delta^{i/2}} \Bigr\}
   > \exp\Bigl\{ -\frac{18}{\sqrt{\Delta}} \Bigr\}, \]

where we used the bound $(a - 1)/(a + 1) > \exp(-2.5/a)$, valid for $a \ge \sqrt{2}$. Since
$\operatorname{sgn} r(t) = (-1)^t$ for $t \in S$, we conclude that

\[ (-1)^t r(t) > \exp\Bigl\{ -\frac{18}{\sqrt{\Delta}} \Bigr\} |r(-t)|, \qquad t \in S. \]

Since in addition $r$ vanishes on $\{-n, \dots, n\} \setminus (S \cup -S)$, we infer from Theorem 5.2
that $R^+(d, S \cup -S) \ge \exp\{-18/\sqrt{\Delta}\}$. □

5.2 High-degree approximation

In the previous subsection, we determined the least error in approximating the
majority function by rational functions of degree up to $\log n$. Our goal here is to
solve the case of higher degrees.

We start with some preparatory work. First, we need to accurately estimate
products of the form $\prod_i (\Delta^i + 1)/(\Delta^i - 1)$ for all $\Delta > 1$. A suitable lower bound
was already given by Newman [31, Lem. 1]:



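Lemma 5.4 (Newman). For all $\Delta > 1$,

\[ \prod_{i=1}^{n} \frac{\Delta^i + 1}{\Delta^i - 1} > \exp\Bigl\{ \frac{2(\Delta^n - 1)}{\Delta^n (\Delta - 1)} \Bigr\}. \]

Proof. Immediate from the bound $(a + 1)/(a - 1) > \exp(2/a)$, which is valid for
$a > 1$. □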



We will need a corresponding upper bound:

Lemma 5.5. For all $\Delta > 1$,

\[ \prod_{i=1}^{\infty} \frac{\Delta^i + 1}{\Delta^i - 1} < \exp\Bigl\{ \frac{4}{\Delta - 1} \Bigr\}. \]

Proof. Let $k \ge 0$ be an integer. By the binomial theorem, $\Delta^i \ge (\Delta - 1)i + 1$ for
integers $i \ge 0$. As a result,

(=1 i=i ^ 



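Also,

\[ \prod_{i=k+1}^{\infty} \frac{\Delta^i + 1}{\Delta^i - 1}
   \le \prod_{i=0}^{\infty} \Bigl( 1 + \frac{2}{(\Delta^{k+1} - 1)\Delta^i} \Bigr)
   < \exp\Bigl\{ \frac{2\Delta}{(\Delta^{k+1} - 1)(\Delta - 1)} \Bigr\}. \]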

Setting k = k{A) = \_-^\ , we conclude that 



~ A' + l 



! = 1 



c 

A - 1 



where 



C = sup 

A>1 



(A-l)lnp^^>+r.^l) + 



2A 



Ak{&)+i - 1 



< 4. 
We will also need the following binomial estimate.

Lemma 5.6. Put $p(t) = \prod_{i=1}^{n} (t - i - \frac{1}{2})$. Then

\[ \max_{t = 1, 2, \dots, n+1} \left| \frac{p(-t)}{p(t)} \right| \le O(16^n). \]

Proof. For $t = 1, 2, \dots, n + 1$, we have

\[ |p(t)| = \frac{(2t - 2)!\,(2n - 2t + 2)!}{4^n (t - 1)!\,(n - t + 1)!}, \qquad
   |p(-t)| = \frac{t!\,(2n + 2t + 1)!}{4^n (2t + 1)!\,(n + t)!}. \]

As a result, 



/2. + 2. + 1W2. + 1X ef^efn 

+ 1 /2? - 2\ /2n - 2? + 2\ ^ /22"\ 



which gives the sought bound. D 

Our construction requires one additional ingredient.

Lemma 5.7. Let $n, d$ be integers, $1 \le d \le n/55$. Consider the polynomial $p(t) =
\prod_{i=1}^{d-1} (t - d\Delta^i\sqrt{\Delta})$, where $\Delta = (n/d)^{1/d}$. Then



min 



pi[_dLi\) 



Pi-Vd^j\) 



> exp 



A\nM 



\n{n/d) - 1 



Proof. Fix j — 1,2, ... ,d. Then for each j = 1 , 2, . . . , j — 1 , 

dAj - ^/A'Va ^ d (a^-'-2 - l) ^ ^ 0" - Oln^. 



and thus 



n ■ 

! = 1 ^ 



1 



dAJ - JAVA 



) 



^ exp 



^ exp 



ln(n /(i) 

4 In 3 J 
Inin/d) 



1 



(5.5) 
For brevity, let ^ stand for the final expression in (5.5 1. Since I ^ d ^ n/55, we 
have [(iA 'J — (iA '~'v^ > 1. As a result, 



pi-ldAJj) 



■' ^ dA-i - I - dA'JA i-l d A' Ja - d A-i 



dAi + (3fA'VA 



n 



dA'^/A+dA-i 



n 

'■=1 

tidAj-dA'J^ l^t/A'VA -i/A^' 

i i ^ A / I ^ A ! Aa~ i i 



. ,dAJ+dAwA ^^dAWA + dAJ 



> ^ 

^ ^ exp ] - 



Ta-iJ ' 



where the last inequality holds by Lemma 5.5 



by (5.5 1 



We have reached the main result of this subsection. 



Theorem 5.8 (High-degree rational approximation of majority). Let $d$ be an
integer, $\log n < d \le n - 1$. Then

\[ R^+(d, \{\pm1, \pm2, \dots, \pm n\}) = \exp\Bigl\{ -\Theta\Bigl( \frac{d}{\log(2n/d)} \Bigr) \Bigr\}. \]

Also,

\[ R^+(n, \{\pm1, \pm2, \dots, \pm n\}) = 0. \]

Proof. The final statement in the theorem follows at once by considering the
rational function $\{p(t) - p(-t)\}/\{p(t) + p(-t)\}$, where $p(t) = \prod_{i=1}^{n} (t + i)$.
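The degree-$n$ rational function in the final statement represents the sign function exactly, which a few lines of exact arithmetic confirm. The sketch assumes $p(t) = \prod_{i=1}^{n}(t+i)$; the function name is ours.

```python
from fractions import Fraction
from math import prod

def exact_sign(t, n):
    """Evaluate {p(t) - p(-t)} / {p(t) + p(-t)} with p(t) = prod_{i=1}^{n} (t + i)."""
    p = lambda s: prod(s + i for i in range(1, n + 1))
    return Fraction(p(t) - p(-t), p(t) + p(-t))

n = 7
values = {t: exact_sign(t, n) for t in range(-n, n + 1) if t != 0}
```

For $t \in \{1, \dots, n\}$ the term $p(-t)$ vanishes, so the quotient is $p(t)/p(t) = 1$. For negative $t$ the roles are swapped and the quotient is $-1$. The denominator is positive in both cases.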
Now assume that log« < d < n/55. Let 

/n\Wd 

* = (;/) ■ 

Define sets 

51 = {\,2,...,k}, 

52 = {ldA'} : i = \,2,---,d}, 
5 = 5i U 52. 



d 

login /d) 
Consider the polynomial 

r(0 = (-irri(0r2(0 



n - 0, 

iG{-n,...,n]\{SU-S) 



where 



We have: 



nit) 



Yllt-i--], r2(o = n(^-^^''^)- 

i=l ^ ^ 1=1 



mm 



r(t) 



ri-t) 



^ min 

i = l,...,k+l 



> exp 



n(0 



rii-i) 
Cd 



mm 

i=l,...,d 



rii-ldA']) 



login /d) 



by Lemmas 5.6 and 5.7 where C > is an absolute constant. Since sgn pit) 
(— 1)' for f G 5, we can restate this result as follows: 

Cd 



(-l)'r(O > exp 



\ri-t)\, t&S. 



login /d) 

Since $r$ vanishes on $\{-n, \dots, n\} \setminus (S \cup -S)$ and has degree $\le 2n - 1 - d$, we infer
from Theorem 5.2 that $R^+(d, S \cup -S) \ge \exp\{-Cd/\log(n/d)\}$. This proves the
lower bound for the case $\log n < d < n/55$.

To handle the case $n/55 \le d \le n - 1$, a different argument is needed. Let

\[ r(t) = (-1)^n\, t \prod_{i=1}^{d} \Bigl( t - i - \frac{1}{2} \Bigr) \prod_{i=d+2}^{n} (t^2 - i^2). \]

By Lemma 5.6, there is an absolute constant $C > 1$ such that

\[ \left| \frac{r(t)}{r(-t)} \right| > C^{-d}, \qquad t = 1, 2, \dots, d + 1. \]

Since $\operatorname{sgn} r(t) = (-1)^t$ for $t = 1, 2, \dots, d + 1$, we conclude that

\[ (-1)^t r(t) > C^{-d} |r(-t)|, \qquad t = 1, 2, \dots, d + 1. \]

Since the polynomial $r$ vanishes on $\{-n, \dots, n\} \setminus \{\pm1, \pm2, \dots, \pm(d + 1)\}$ and has
degree $2n - 1 - d$, we infer from Theorem 5.2 that

\[ R^+(d, \{\pm1, \pm2, \dots, \pm(d + 1)\}) \ge C^{-d}. \]

This settles the lower bound for the case $n/55 \le d \le n - 1$.

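It remains to prove the upper bound for the case $\log n < d \le n - 1$. Here we
always have $d \ge 2$. Letting $k = \lfloor d/2 \rfloor$ and $\Delta = (n/k)^{1/k}$, define $p \in P_{2k}$ by

\[ p(t) = \prod_{i=1}^{k} (t + i) \prod_{i=1}^{k} (t + k\Delta^i). \]

Fix any point $t \in \{1, 2, \dots, n\}$ with $p(-t) \ne 0$. Letting $i^*$ be the integer with
$k\Delta^{i^*} < t < k\Delta^{i^*+1}$, we have:

\[ \frac{p(t)}{|p(-t)|}
   > \prod_{i=0}^{i^*} \frac{k\Delta^{i^*+1} + k\Delta^i}{k\Delta^{i^*+1} - k\Delta^i}
     \prod_{i=i^*+1}^{k} \frac{k\Delta^i + k\Delta^{i^*}}{k\Delta^i - k\Delta^{i^*}}
   \ge \prod_{i=1}^{k} \frac{\Delta^i + 1}{\Delta^i - 1}
   > \exp\Bigl\{ \frac{2(\Delta^k - 1)}{\Delta^k (\Delta - 1)} \Bigr\}, \]

where the last inequality holds by Lemma 5.4. Substituting $\Delta = (n/k)^{1/k}$ and
recalling that $k \ge \Theta(\log n)$, we obtain $p(t) > A|p(-t)|$ for $t = 1, 2, \dots, n$,
where

\[ A = \exp\Bigl\{ \Theta\Bigl( \frac{k}{\log(n/k)} \Bigr) \Bigr\}. \]

As a result, $R^+(2k, \{\pm1, \pm2, \dots, \pm n\}) \le 2A/(A^2 + 1)$, the approximant in
question being

\[ \frac{A^2 - 1}{A^2 + 1} \cdot \frac{p(t) - p(-t)}{p(t) + p(-t)}. \]

□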
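The scaling step at the end of the proof can be tested on a concrete instance. The sketch assumes $p(t) = \prod_{i=1}^{k}(t+i)\prod_{i=1}^{k}(t+k\Delta^i)$ and the approximant $\frac{A^2-1}{A^2+1}\cdot\frac{p(t)-p(-t)}{p(t)+p(-t)}$. Here $A$ is taken to be the exact minimum of $p(t)/|p(-t)|$ over the points where $p(-t) \ne 0$, rather than the asymptotic quantity in the text, and the parameters $n = 64$, $k = 4$, $\Delta = 2$ are chosen so that all roots are integers. The function names are ours.

```python
from fractions import Fraction
from math import exp, prod

def build_p(k, delta):
    # Roots at -1, ..., -k and at -k*delta, ..., -k*delta^k (integer delta assumed).
    roots = list(range(1, k + 1)) + [k * delta ** i for i in range(1, k + 1)]
    return lambda t: prod(t + r for r in roots)

def approximation_error(n, k, delta):
    """Return (A, err): the separation A and the exact worst-case error on {±1..±n}."""
    p = build_p(k, delta)
    points = range(1, n + 1)
    A = min(Fraction(p(t), abs(p(-t))) for t in points if p(-t) != 0)
    scale = (A * A - 1) / (A * A + 1)
    approx = lambda t: scale * Fraction(p(t) - p(-t), p(t) + p(-t))
    err = max(max(abs(1 - approx(t)), abs(-1 - approx(-t))) for t in points)
    return A, err

def lemma_54_bound(delta, k):
    # exp{2(delta^k - 1) / (delta^k (delta - 1))}, the lower bound on A from Lemma 5.4
    return exp(2 * (delta ** k - 1) / (delta ** k * (delta - 1)))

A, err = approximation_error(64, 4, 2)
```

The factor $(A^2-1)/(A^2+1)$ centers the range of the quotient around $\pm1$, which is what brings the error down to $2A/(A^2+1)$.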



5.3 Equivalence of the majority and sign functions 

It remains to prove the promised equivalence of the majority and sign functions,
from the standpoint of approximating them by rational functions on the discrete
domain. We have:

Theorem 5.9. For every integer $d$,

\[ R^+(\mathrm{MAJ}_n, d) \le R^+(d - 2, \{\pm1, \pm2, \dots, \pm\lceil n/2 \rceil\}), \tag{5.6} \]
\[ R^+(\mathrm{MAJ}_n, d) \ge R^+(d, \{\pm1, \pm2, \dots, \pm\lfloor n/2 \rfloor\}). \tag{5.7} \]

Proof. We prove (5.6) first. Fix a degree-$(d - 2)$ approximant $p(t)/q(t)$ to $\operatorname{sgn} t$
on $S = \{\pm1, \dots, \pm\lceil n/2 \rceil\}$, where $q$ is positive on $S$. For small $\delta > 0$, define

\[ A_\delta(t) = \frac{t^2 p(t) - \delta}{t^2 q(t) + \delta}. \]

Then $A_\delta$ is a rational function of degree at most $d$ whose denominator is positive
on $S \cup \{0\}$. Finally, we have $A_\delta(0) = -1$ and

\[ \lim_{\delta \to 0} \max_{t \in S} |\operatorname{sgn} t - A_\delta(t)| = \max_{t \in S} \left| \operatorname{sgn} t - \frac{p(t)}{q(t)} \right|. \]



Then As{\ "Y^ixi + \) — \n /2\) is the desired approximant for MAJ„(:ti, . . . ,x„). 



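We now turn to the lower bound, (5.7). For every $\epsilon > R^+(\mathrm{MAJ}_n, d)$,
Proposition 2.7 gives a univariate rational function $p(t)/q(t)$ of degree at most $d$ such that
for all $x \in \{-1,+1\}^n$, one has

\[ \left| \mathrm{MAJ}_n(x) - \frac{p(\sum x_i)}{q(\sum x_i)} \right| < \epsilon \]

and $q(\sum x_i) > 0$. Then

\[ \max_{t = \pm1, \pm2, \dots, \pm\lfloor n/2 \rfloor} \left| \operatorname{sgn} t - \frac{p(2t + n - 2\lfloor n/2 \rfloor)}{q(2t + n - 2\lfloor n/2 \rfloor)} \right| < \epsilon, \]

completing the proof of (5.7). □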



D 



Note that (2.2) and Theorems 5.3, 5.8, and 5.9 immediately imply Theorem 1.7
from the Introduction.

Remark 5.10. The proof that we gave for the upper bound, (5.6), illustrates a
useful property of univariate rational approximants $A(t) = p(t)/q(t)$ on a finite
set $S$. Specifically, given such an approximant and a point $t^* \notin S$, there exists an
approximant $A'$ with $A'(t^*) = a$ for any prescribed value $a$ and $A' \approx A$ everywhere
on $S$. One such construction is

\[ A'(t) = \frac{(t - t^*) p(t) + a\delta}{(t - t^*) q(t) + \delta} \]

for an arbitrarily small constant $\delta > 0$. Note that $A'$ has degree only 1 higher than
the degree of the original approximant, $A$. This phenomenon is in sharp contrast to
approximation by polynomials, which do not possess this corrective ability.
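The corrective construction of Remark 5.10 can be tried out directly. In the sketch below, the approximant $p(t)/q(t) = 2t/(1+t^2)$, the set $S = \{\pm1,\pm2,\pm3\}$, the point $t^* = 0$, and the target value $a = 5$ are all illustrative choices of ours, not taken from the paper.

```python
from fractions import Fraction

def corrected(p, q, t_star, a, delta):
    """Return A'(t) = ((t - t*) p(t) + a*delta) / ((t - t*) q(t) + delta)."""
    def A_prime(t):
        numerator = Fraction((t - t_star) * p(t) + a * delta)
        denominator = (t - t_star) * q(t) + delta
        return numerator / denominator
    return A_prime

p = lambda t: 2 * t            # illustrative numerator
q = lambda t: 1 + t * t        # illustrative denominator, positive everywhere
S = [-3, -2, -1, 1, 2, 3]

A_prime = corrected(p, q, t_star=0, a=5, delta=Fraction(1, 10 ** 6))
deviation = max(abs(A_prime(t) - Fraction(p(t), q(t))) for t in S)
```

At $t^*$ the quotient collapses to $a\delta/\delta = a$, while on $S$ the difference from $p/q$ is $\delta\,(a\,q(t) - p(t)) / \bigl(q(t)\,((t-t^*)q(t)+\delta)\bigr)$, which tends to $0$ with $\delta$.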



6 Intersections of halfspaces 

In this section, we prove our main theorems on the sign-representation of
intersections of halfspaces and majority functions. In the two subsections that follow,
we give results for the threshold degree as well as threshold density, another key
complexity measure of a sign-representation.

6.1 Lower bounds on the threshold degree

We start by formalizing the elegant observation due to Beigel et al. [9], already
described briefly in the Introduction.

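Theorem 6.1 (Beigel, Reingold, and Spielman). Let $f\colon X \to \{-1,+1\}$ and
$g\colon Y \to \{-1,+1\}$ be given functions, where $X, Y \subset \mathbb{R}^n$ are finite sets. Let $d$ be
an integer with $R^+(f, d) + R^+(g, d) < 1$. Then

\[ \deg_\pm(f \wedge g) \le 2d. \]

Proof. Fix rational functions $p_1(x)/q_1(x)$ and $p_2(y)/q_2(y)$ of degree at most $d$
such that $q_1$ and $q_2$ are positive on $X$ and $Y$, respectively, and

\[ \max_{x \in X} \left| f(x) - \frac{p_1(x)}{q_1(x)} \right| + \max_{y \in Y} \left| g(y) - \frac{p_2(y)}{q_2(y)} \right| < 1. \]

Then

\[ f(x) \wedge g(y) \equiv \operatorname{sgn}\{1 + f(x) + g(y)\} \equiv \operatorname{sgn}\Bigl\{ 1 + \frac{p_1(x)}{q_1(x)} + \frac{p_2(y)}{q_2(y)} \Bigr\}. \]

Multiplying the last expression by the positive quantity $q_1(x) q_2(y)$, we obtain
$f(x) \wedge g(y) \equiv \operatorname{sgn}\{q_1(x) q_2(y) + p_1(x) q_2(y) + p_2(y) q_1(x)\}$. □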
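A toy instance makes the construction concrete. In the sketch below, $f = g = \operatorname{sgn}$ on $X = Y = \{\pm1,\pm2,\pm3\}$ and the approximant $2t/(1+t^2)$ (maximum error $2/5$) are our own illustrative choices. Following the paper's convention, $-1$ encodes "true", so the conjunction equals $-1$ exactly when both inputs are $-1$.

```python
from fractions import Fraction
from itertools import product

X = [-3, -2, -1, 1, 2, 3]
sgn = lambda t: 1 if t > 0 else -1
p = lambda t: 2 * t             # numerator of the illustrative approximant
q = lambda t: 1 + t * t         # denominator, positive on X

def conjunction(a, b):
    # -1 encodes "true": the AND is -1 only when both inputs are -1
    return -1 if a == -1 and b == -1 else 1

def sign_representation(x, y):
    """q1(x) q2(y) + p1(x) q2(y) + p2(y) q1(x), with p1 = p2 = p and q1 = q2 = q."""
    return q(x) * q(y) + p(x) * q(y) + p(y) * q(x)

error = max(abs(sgn(t) - Fraction(p(t), q(t))) for t in X)
agrees = all(
    sgn(sign_representation(x, y)) == conjunction(sgn(x), sgn(y))
    for x, y in product(X, repeat=2)
)
```

Since the two errors add up to $4/5 < 1$, the degree-$4$ polynomial `sign_representation` has the sign of $f(x) \wedge g(y)$ at every point of $X \times Y$, as the theorem predicts for $d = 2$.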



Recall that Theorem 3.17 gives an essentially exact converse to Theorem 6.1.
We are now in a position to prove our main results on the threshold degree.



Theorem 6.2 (restatement of Theorems 1.8 and 1.10). Consider the function
$f\colon \{-1,+1\}^{n^2} \to \{-1,+1\}$ given by

\[ f(x) = \operatorname{sgn}\Bigl( 1 + \sum_{i=1}^{n} \sum_{j=1}^{n} 2^i x_{ij} \Bigr). \]

Let $g\colon \{-1,+1\}^n \to \{-1,+1\}$ be the majority function on $n$ bits. Then

\[ \deg_\pm(f \wedge f) = \Omega(n), \tag{6.1} \]
\[ \deg_\pm(g \wedge g) = \Omega(\log n). \tag{6.2} \]

Proof. By Theorem 4.10, we have $R^+(f, \epsilon n) \ge 1/2$ for some constant $\epsilon > 0$,
which settles (6.1) in view of Theorem 3.17.

Analogously, Theorems 5.1 and 5.9 show that $R^+(g, \epsilon \log n) \ge 1/2$ for some
constant $\epsilon > 0$, which settles (6.2) in view of Theorem 3.17. □

Remark 6.3. The lower bounds (6.1) and (6.2) are tight and match the
constructions due to Beigel et al. [9]. These matching upper bounds can be seen as
follows. By Theorem 4.2, we have $R^+(f, Cn) < 1/2$ for some constant $C > 0$,
which shows that $\deg_\pm(f \wedge f) = O(n)$ in view of Theorem 6.1. Analogously,
Theorems 5.1 and 5.9 imply that $R^+(g, C \log n) < 1/2$ for some constant $C > 0$,
which shows that $\deg_\pm(g \wedge g) = O(\log n)$ in view of Theorem 6.1.

Furthermore, Theorem 6.1 generalizes immediately to conjunctions of $k = 3$
and more functions. In particular, the lower bounds in (6.1) and (6.2) remain
tight for intersections $f \wedge f \wedge \cdots \wedge f$ and $g \wedge g \wedge \cdots \wedge g$ featuring any constant
number of functions.

We give one additional result, featuring the intersection of the canonical 
halfspace with a majority function. 

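Theorem 1.9 (restated). Let $f\colon \{-1,+1\}^{n^2} \to \{-1,+1\}$ be given by 
\[ f(x) = \operatorname{sgn}\left(1 + \sum_{i=1}^{n}\sum_{j=1}^{n} 2^i x_{ij}\right). \]
Let $g\colon \{-1,+1\}^{\lceil\sqrt{n}\rceil} \to \{-1,+1\}$ be the majority function on $\lceil\sqrt{n}\rceil$ bits. Then 
\[ \deg_\pm(f \wedge g) = \Theta(\sqrt{n}). \tag{6.3} \]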
Proof. We prove the lower bound first. Let $\epsilon > 0$ be a suitably small constant. By 
Theorem 4.10, we have $R^+(f, \epsilon\sqrt{n}) \ge 1 - 2^{-\sqrt{n}}$. By Theorems 5.1 and 5.9, we 
have $R^+(g, \epsilon\sqrt{n}) \ge 2^{-\sqrt{n}}$. In view of Theorem 3.17, these two facts imply that 
$\deg_\pm(f \wedge g) = \Omega(\sqrt{n})$. 

We now turn to the upper bound. It is clear that $R^+(g, \lceil\sqrt{n}\rceil) = 0$ and 
$R^+(f, 1) < 1$. It follows by Theorem 6.1 that $\deg_\pm(f \wedge g) = O(\sqrt{n})$. □ 



6.2 Lower bounds on the threshold density 

In addition to threshold degree, several other complexity measures are of inter- 
est when sign-representing Boolean functions by real polynomials. One such 
complexity measure is density, i.e., the number of distinct monomials in any 
polynomial that sign-represents a given function. Formally, for a given function 
$f\colon \{-1,+1\}^n \to \{-1,+1\}$, the threshold density $\operatorname{dns}(f)$ is the minimum $k$ such 
that 
\[ f(x) \equiv \operatorname{sgn}\left(\sum_{i=1}^{k} \lambda_i \prod_{j\in S_i} x_j\right) \]
for some sets $S_1, \dots, S_k \subseteq \{1, 2, \dots, n\}$ and some reals $\lambda_1, \dots, \lambda_k$. We will show 
that intersections of two halfspaces not only have high threshold degree but also 
high threshold density. 

We start with the conjunction of two majority functions. Constructions in [9] 
show that the function $f(x, y) = \mathrm{MAJ}_n(x) \wedge \mathrm{MAJ}_n(y)$ can be sign-represented 
by a linear combination of $n^{O(\log n)}$ monomials, namely, the monomials of degree 
up to $O(\log n)$. Klivans and Sherstov [24, Thm. 1.2] complement this with a lower 
bound of $n^{\Omega(\log n/\log\log n)}$ on the number of distinct monomials needed. Our next 
result improves this lower bound to a tight $n^{\Theta(\log n)}$. 

Theorem 6.4. Let $f\colon \{-1,+1\}^n \times \{-1,+1\}^n \to \{-1,+1\}$ be given by 
$f(x, y) = \mathrm{MAJ}_n(x_1, \dots, x_n) \wedge \mathrm{MAJ}_n(y_1, \dots, y_n)$. Then 
\[ \operatorname{dns}(f) = n^{\Omega(\log n)}. \]

Proof. Identical to the proof of Klivans and Sherstov [24, §3.3, Thm. 1.2], with 
the only difference that Theorem 1.10 should be invoked in place of O'Donnell 
and Servedio's earlier result [33] that $\deg_\pm(f) = \Omega(\log n/\log\log n)$. □ 

We will now derive an exponential lower bound on the threshold density 
of the intersection of two halfspaces. For this, we recall an elegant procedure 
for converting Boolean functions with high threshold degree into Boolean 
functions with high threshold density, discovered by Krause and Pudlák [26]. Their 
construction maps a given function $f\colon \{-1,+1\}^n \to \{-1,+1\}$ to the function 
$f^{\mathrm{KP}}\colon (\{-1,+1\}^n)^3 \to \{-1,+1\}$ given by 
\[ f^{\mathrm{KP}}(x, y, z) = f(\dots, (\overline{z_i} \wedge x_i) \vee (z_i \wedge y_i), \dots). \]
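
To make the Krause-Pudlák construction concrete, here is a small Python sketch. It is an illustration under stated assumptions rather than the authors' code: the names `select`, `select_poly`, `kp`, and `maj3` are hypothetical, −1 is taken to mean "true", and the selector is read as "output $y_i$ if $z_i$ is true, otherwise $x_i$". The sketch also records the multilinear form $(x + y + xz - yz)/2$ of the selector, which is what links the construction to monomial counts.

```python
import itertools

def select(x, y, z):
    # (not z AND x) OR (z AND y) with -1 = true: y if z is true, else x.
    return y if z == -1 else x

def select_poly(x, y, z):
    # Multilinear form of the selector; the numerator is always +2 or -2.
    return (x + y + x * z - y * z) // 2

def kp(f, n, selector=select):
    # Map f on n bits to f^KP on 3n bits by applying the selector coordinatewise.
    def f_kp(x, y, z):
        return f(tuple(selector(x[i], y[i], z[i]) for i in range(n)))
    return f_kp

def maj3(w):
    # Majority on three +/-1 bits (the sign of the sum).
    return 1 if sum(w) > 0 else -1

def selector_forms_agree():
    # The Boolean selector and its polynomial form coincide on all 8 inputs.
    return all(select(x, y, z) == select_poly(x, y, z)
               for x, y, z in itertools.product((-1, 1), repeat=3))
```

For instance, `kp(maj3, 3)` evaluated at `z = (-1, -1, -1)` computes the majority of `y`, while at `z = (1, 1, 1)` it computes the majority of `x`.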



We have: 



Theorem 6.5 (Krause and Pudlák [26, Prop. 2.1]). For every function 
$f\colon \{-1,+1\}^n \to \{-1,+1\}$, 
\[ \operatorname{dns}(f^{\mathrm{KP}}) \ge 2^{\deg_\pm(f)}. \]

Another ingredient in our analysis is the following observation. 



Lemma 6.6 (Klivans and Sherstov [24]). Let $f\colon \{-1,+1\}^n \to \{-1,+1\}$ be 
a given function. Consider any function $F\colon \{-1,+1\}^m \to \{-1,+1\}$ given by 
$F(x) = f(\chi_1(x), \dots, \chi_n(x))$, where each $\chi_i$ is a parity function $\{-1,+1\}^m \to 
\{-1,+1\}$ or the negation of a parity function. Then 
\[ \operatorname{dns}(f) \ge \operatorname{dns}(F). \]

Proof (Klivans and Sherstov [24]). Immediate from the definition of threshold 
density and the fact that the product of parity functions is another parity 
function. □ 
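
The mechanism behind Lemma 6.6 can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation: polynomials are stored as dictionaries from monomials (frozensets of variable indices) to coefficients, each substituted function is a hypothetical pair `(sign, T)` standing for $\pm\prod_{j\in T} x_j$, and the names `substitute`, `evaluate`, and `agrees` are introduced here for illustration. Because $x_j^2 = 1$ on the hypercube, each monomial maps to a single monomial, so the number of monomials cannot grow.

```python
import itertools

def substitute(poly, chis):
    # poly: {frozenset of indices: coefficient} in variables w_0..w_{n-1}.
    # chis[i] = (sign, T) encodes w_i = sign * prod_{j in T} x_j.
    out = {}
    for S, coeff in poly.items():
        sign, T = 1, frozenset()
        for i in S:
            s_i, T_i = chis[i]
            sign *= s_i
            T = T ^ T_i  # x_j^2 = 1, so index sets combine by symmetric difference
        out[T] = out.get(T, 0) + sign * coeff
    # Drop monomials whose coefficients cancelled.
    return {T: c for T, c in out.items() if c != 0}

def evaluate(poly, x):
    # Evaluate a sparse multilinear polynomial at the point x.
    total = 0
    for S, c in poly.items():
        term = c
        for j in S:
            term *= x[j]
        total += term
    return total

def agrees(poly, chis, m):
    # Check pointwise on {-1,+1}^m that substitution matches composition,
    # and that the monomial count did not increase.
    sub = substitute(poly, chis)
    for x in itertools.product((-1, 1), repeat=m):
        inner = []
        for sign, T in chis:
            v = sign
            for j in T:
                v *= x[j]
            inner.append(v)
        if evaluate(sub, x) != evaluate(poly, inner):
            return False
    return len(sub) <= len(poly)
```

As a usage example, substituting $w_0 = x_0x_1$ and $w_1 = -x_1x_2$ into $1 + w_0 + w_1$ yields $1 + x_0x_1 - x_1x_2$, again with three monomials.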

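We are now in a position to prove the desired result for halfspaces. 

Theorem 6.7. Let $f_n\colon \{-1,+1\}^{n^2} \to \{-1,+1\}$ be given by 
\[ f_n(x) = \operatorname{sgn}\left(1 + \sum_{i=1}^{n}\sum_{j=1}^{n} 2^i x_{ij}\right). \]
Then 
\[ \operatorname{dns}(f_n \wedge f_n) = \exp\{\Omega(n)\}, \tag{6.4} \]
\[ \operatorname{dns}(f_n \wedge \mathrm{MAJ}_{\lceil\sqrt{n}\rceil}) = \exp\{\Omega(\sqrt{n})\}. \tag{6.5} \]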

Remark 6.8. In the proof below, it will be useful to keep in mind the following 
straightforward observation. Fix functions $f, g\colon \{-1,+1\}^k \to \{-1,+1\}$ 
and define functions $f', g'\colon \{-1,+1\}^k \to \{-1,+1\}$ by $f'(x) = -f(-x)$ and 
$g'(y) = -g(-y)$. Then we have $f'(x) \wedge g'(y) \equiv -(f(-x) \wedge g(-y))\, f(-x)\, g(-y)$, 
whence $\operatorname{dns}(f' \wedge g') \le \operatorname{dns}(f \wedge g) \operatorname{dns}(f) \operatorname{dns}(g)$ and thus 
\[ \operatorname{dns}(f \wedge g) \ge \frac{\operatorname{dns}(f' \wedge g')}{\operatorname{dns}(f)\operatorname{dns}(g)}. \tag{6.6} \]
Similarly, we have $f(x) \wedge g'(y) \equiv (f(x) \wedge g(-y))\, f(x)$, whence 
\[ \operatorname{dns}(f \wedge g) \ge \frac{\operatorname{dns}(f \wedge g')}{\operatorname{dns}(f)}. \tag{6.7} \]
To summarize, (6.6) and (6.7) allow one to analyze the threshold density of $f \wedge g$ 
by analyzing the threshold density of $f' \wedge g'$ or $f \wedge g'$ instead. Such a transition 
will be helpful in our case. 
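
The two pointwise identities used in Remark 6.8 depend only on the values taken by the functions involved, so they can be confirmed by exhausting the four sign combinations. The Python sketch below does this; the names `conj` and `check_remark_identities` are hypothetical, and −1 is assumed to mean "true".

```python
import itertools

def conj(a, b):
    # AND under the convention -1 = true, +1 = false.
    return -1 if (a == -1 and b == -1) else 1

def check_remark_identities():
    for a, b in itertools.product((-1, 1), repeat=2):
        # First identity: a = f(-x), b = g(-y), so f'(x) = -a and g'(y) = -b.
        if conj(-a, -b) != -conj(a, b) * a * b:
            return False
        # Second identity: a = f(x), b = g(-y), so g'(y) = -b.
        if conj(a, -b) != conj(a, b) * a:
            return False
    return True
```

Since a product of sign-representing polynomials sign-represents the product of the functions, and the number of monomials in a product is at most the product of the monomial counts, these identities yield the density inequalities of the remark.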

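Proof of Theorem 6.7. Put $m = \lfloor n/4 \rfloor$. The function $f_m^{\mathrm{KP}}\colon (\{-1,+1\}^{m^2})^3 \to 
\{-1,+1\}$ has the representation 
\[ f_m^{\mathrm{KP}}(x, y, z) = \operatorname{sgn}\left(1 + \sum_{i=1}^{m}\sum_{j=1}^{m} 2^i (x_{ij} + y_{ij} + x_{ij}z_{ij} - y_{ij}z_{ij})\right). \]
As a result, 
\begin{align*}
\operatorname{dns}(f_{4m} \wedge f_{4m}) &\ge \operatorname{dns}(f_m^{\mathrm{KP}} \wedge f_m^{\mathrm{KP}}) && \text{by Lemma 6.6} \\
&= \operatorname{dns}((f_m \wedge f_m)^{\mathrm{KP}}) \\
&\ge \exp\{\Omega(m)\} && \text{by Theorems 6.5 and 6.2.}
\end{align*}
By the same argument as in Theorem 4.10, the function $f_{4m}$ is a subfunction of 
$f_n(x)$ or $-f_n(-x)$. In the former case, (6.4) is immediate from the lower bound on 
$\operatorname{dns}(f_{4m} \wedge f_{4m})$. In the latter case, (6.4) follows from the lower bound on 
$\operatorname{dns}(f_{4m} \wedge f_{4m})$ and Remark 6.8. 

The proof of (6.5) is entirely analogous. □ 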
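
The representation of $f_m^{\mathrm{KP}}$ as a halfspace in parity functions can be sanity-checked by brute force for small $m$. The Python sketch below is an illustration under assumptions: it takes the representation to be $\operatorname{sgn}(1 + \sum_{i,j} 2^i(x_{ij} + y_{ij} + x_{ij}z_{ij} - y_{ij}z_{ij}))$ (the normalization of the weights is an assumption), it reads the selector as "$y_{ij}$ if $z_{ij} = -1$, else $x_{ij}$", and all function names are hypothetical.

```python
import itertools

def sgn(t):
    # Sign of a nonzero integer.
    return 1 if t > 0 else -1

def halfspace(w, m):
    # f_m(w) = sgn(1 + sum_{i,j} 2^i w_ij); row i = 1..m is stored at index i - 1.
    return sgn(1 + sum(2 ** (i + 1) * w[i][j]
                       for i in range(m) for j in range(m)))

def kp_direct(x, y, z, m):
    # Apply the selector coordinatewise, then evaluate the halfspace.
    w = [[y[i][j] if z[i][j] == -1 else x[i][j] for j in range(m)]
         for i in range(m)]
    return halfspace(w, m)

def kp_formula(x, y, z, m):
    # Assumed closed form: a halfspace in the parities x, y, xz and -yz.
    return sgn(1 + sum(2 ** (i + 1) * (x[i][j] + y[i][j]
                                       + x[i][j] * z[i][j] - y[i][j] * z[i][j])
                       for i in range(m) for j in range(m)))

def grids(m):
    # All m-by-m arrays with entries in {-1, +1}.
    for bits in itertools.product((-1, 1), repeat=m * m):
        yield [list(bits[r * m:(r + 1) * m]) for r in range(m)]

def representation_holds(m):
    # Compare the two evaluations on every input (feasible only for tiny m).
    return all(kp_direct(x, y, z, m) == kp_formula(x, y, z, m)
               for x in grids(m) for y in grids(m) for z in grids(m))
```

The check works because each bracketed term equals twice the selected bit, so the sum inside the closed form is exactly twice the even sum appearing in $f_m$, and adding 1 to either leaves the sign unchanged.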



Krause and Pudlák's method in Theorem 6.5 naturally generalizes to linear 
combinations of conjunctions rather than parity functions. In other 
words, if a function $f\colon \{-1,+1\}^n \to \{-1,+1\}$ has threshold degree $d$ and 
$f^{\mathrm{KP}}(x, y, z) \equiv \operatorname{sgn}\left(\sum_{i=1}^{N} \lambda_i T_i(x, y, z)\right)$ for some conjunctions $T_1, \dots, T_N$ of 
the literals $x_1, y_1, z_1, \dots, x_n, y_n, z_n, \neg x_1, \neg y_1, \neg z_1, \dots, \neg x_n, \neg y_n, \neg z_n$, then $N \ge 
2^{\Omega(d)}$. With this remark in mind, Theorems 6.4 and 6.7 and their proofs adapt easily 
to this alternate definition of density. 